Authors' abstract:

A small set of constructs can simulate a wide variety of 
apparently distinct features in modern programming languages. 
Using a kernel language called Pebble based on the typed lambda 
calculus with bindings, declarations, and types as first-class 
values, we show how to build modules, interfaces and 
implementations, abstract data types, generic types, recursive 
types, and unions. Pebble has a concise operational semantics 
given by inference rules. 

R. Burstall and B. Lampson



Capsule review: 

Programming-language designers have invented a variety of 
language extensions and special notations to deal with several 
problems that arise in programming in the large. Some of the 
differences among such features in Ada, CLU, Euclid, Mesa, ML, 
Modula, Russell, SML, et al. are superficial; others are 
fundamental. Without a uniform semantic framework it is 
difficult to compare and evaluate these features, or to 
determine which choices are arbitrary and which are tightly 
constrained. 

Pebble is a simpler language, intended for the precise 
description of language constructs. It is used to explain 
strongly typed module interconnection languages, abstract data 
types, and procedures that are parameterized with respect to the 
types of operands. It is based on the typed lambda calculus, 
extended to encompass the linking together of separately checked 
modules into a program. Bindings, declarations, and types -- as 
well as functions -- are all treated as first-class values; the 
type system includes dependent types. 

This paper presents an informal overview of why the approach can 
be expected to work. But the precise definition of the features 
of existing languages in terms of Pebble is left as "an exercise
for the reader."

The semantics of Pebble are presented both informally and 
formally. Representative cases are presented in great detail, 
to illustrate the workings of the formalism. 



Jim Horning 
1. Introduction

Programming language designers have invented a number of features to support the writing of 
large programs in a modular way which takes advantage of type-checking. As languages have 
grown in size these features have been added to the basic structure of expressions, statements 
and procedures in various ad-hoc fashions, increasing the syntactic and semantic complexity of 
the language. It is not too clear what the underlying concepts or the language design options 
are. In particular cases various kinds of parameterised types or modules are offered, and it is 
unclear how these are related to the ideas of function definition and application, which can be 
formalised very simply in the lambda calculus. 

This paper describes a small programming language called Pebble, which provides a precise 
model for these features. It is a functional language, based upon the lambda calculus with 
types. It is addressed to the problems of data types, abstract data types and modules. It also 
deals with the idea of generic values. It does not reflect all aspects of programming languages, 
since we have not dealt with assignment, exceptions or concurrency, although we believe these 
could be added to our framework. Our intention is that it should be possible to express the 
semantics of a sizeable part of a real programming language by giving rules which rewrite it 
into Pebble. This follows the method used by Bauer and his colleagues [Bauer et al. 1979] to
express the semantics of their wide spectrum language. We were particularly concerned with the
Cedar language (an extension of Mesa [Mitchell et al. 1979]) which is in use at Xerox. One of 
us (BL) has defined the quite complex part of this language which is concerned with data types 
and modules in terms of rewrite rules which convert Cedar to an earlier version of Pebble; this 
work is described in an unpublished report 

Practical motivation 

A principal idea which we wish to express in our formalism is the linking together of a number 
of modules into a large program. This idea may be summarized as follows. Each program
module produces an implementation of some collection of data types and procedures. In order 
to do so it may require the implementations supplied to it by some other modules. This traffic 
in implementations is controlled by interfaces which say what kind of implementation is 
required or produced by a module. These interfaces name the data types and specify the 
argument and result types of the procedures. Given a large collection of modules, perhaps the 
work of many people at different times, it is essential to be able to express easily different ways 
of connecting them together, that is, ways of providing the implementations needed by each 
module. An input interface of a module may be satisfied by the implementations produced by 
several different modules or different "versions" of the same module. 

We believe that linking should not be described in a primitive and ad hoc special-purpose 
language; it deserves more systematic treatment. In our view the linking should be expressed in 
a functional applicative language, in which modules are regarded as functions from 
implementations to implementations. Furthermore this language should be typed, and the
interfaces should play the role of types for the implementations. Thus we have the 
correspondence: 



implementation  <->  value
interface       <->  type
module          <->  function

Function application is more appropriate for linking than schemes based on the names of the 
modules and the sequence in which they are presented. By choosing suitable structured types 
in a functional language we can get a simple notation for dealing with "big" objects (pieces of 
a program) as if they were "small" ones (numbers); this is the basic good trick in matrix 
algebra. Thus we hope to make "Programming in the Large" look very much like
"Programming in the Small". 

Another advantage of this approach to linking is that the linking language can be incorporated 
in the programming language. We hope in this way to achieve both conceptual economy and 
added flexibility in expressing linking. By contrast, the usual approach to the linking problem, 
exemplified by Mesa and C-Mesa [Mitchell et al. 1979], has a programming language (Mesa) 
with a separate and different linking language (C-Mesa) which sits on top of it so to speak. The 
main advantage of this approach is that a separate linking language can be used for linking 
modules of more than one programming language, though in the past this advantage has been 
gained only at the price of using an extremely primitive linking language.

A linking system called the System Modeller was built by Eric Schmidt for his Ph.D. thesis 
work, supervised by one of us (BL). He used an earlier version of Pebble with some modifica- 
tions, notably to provide default values for arguments since these are often obvious from the 
context [Schmidt 1982, Lampson and Schmidt 1983]. The System Modeller was used by several 
people to build large systems, but the implementation has not been polished sufficiently for 
widespread use. 

Our other practical motivation was to investigate how to provide polymorphic functions in 
Cedar, that is ones which will work uniformly for argument values of different types; for ex- 
ample, a matrix transpose procedure should work for integer matrices as well as for real 
matrices. 

Outline of the paper 

We start from Landin's view of programming languages as lambda calculus sweetened with syn- 
tactic sugar [Landin 1964]. Since we are dealing with typed languages, we have to use typed 
lambda calculus, but it turns out that we need to go further and extend the type system with 
dependent types. We take types as values, although they only need to be handled during type- 
checking (which may involve some evaluation) and not at execution time. We thus handle all 
variable binding with just one kind of lambda expression. Another extension is needed 
because, whilst procedures accept n-tuples of values, for example (1, 5, 3), at the module level 
it is burdensome to rely on position in a sequence to identify parameters and it is usual to 
associate them with names, for example (x~1, y~5, z~3). This leads us to the notion of a
binding. To elucidate the notion of parameterised module we include such bindings as values 
in Pebble. It turns out that the scoping of the names which they contain does not create 
problems. 



To give a precise semantics of Pebble we give an operational semantics in the form of in- 
ference rules, using a formalism due to Plotkin [1981], with some variations. We could have 
attempted a denotational semantics, but this would have raised theoretical questions rather
different from our concerns about language design. So far as we know it would be quite 
possible to give a satisfactory denotational semantics for Pebble, and we should be interested
to learn of anyone attempting this task. Our semantics gives rules for type-checking as well as 
evaluation. Our rules are in fact deterministic and hence can be translated into an interpreter 
in a conventional programming language such as Pascal. We give a fragment of such a 
translation in § 5.4. 

Related work 

Our work is of course much indebted to that of others. Reynolds, in a pioneering effort, 
treated the idea of polymorphic types by introducing a special kind of lambda expression 
[Reynolds 1974], and McCracken built on this approach [McCracken 1979]. The language 
Russell introduced dependent types for functions and later for products [Demers and Donahue 
1980]. MacQueen and Sethi have done some elegant work on the semantics of a statically typed 
lambda calculus with dependent types, using the idea that these should be expressed by quan- 
tified types; this idea of universally and existentially quantified types was introduced in logic 
by Girard [Girard 1972] and used by Martin-Lof [Martin-Lof 1973] for the constructive logic 
of mathematics. Mitchell and Plotkin seem to have each independently noted the usefulness of 
existentially quantified types for explaining data abstraction. We had already noticed this 
utility for dependent products, learning later of the work on Russell and the connection with 
quantified types. It is a little hard to know who first made these observations; they seem to 
have been very much "in the air". 

The main difference of our approach from that using quantified types is that we take types as 
values and have only one kind of lambda expression. Russell also takes types as values, but 
they are abstract data types with operations, whereas we start with types viewed as simple predi- 
cates without operations, building more complex types from this simple basis. The idea of 
taking bindings as values also appears in [Plotkin 1981] with a somewhat similar motivation. 
Our work has been influenced by previous work by one of us with Goguen on the design of
the specification language Clear [Burstall and Goguen 1977].

Acknowledgements

We would like to thank a number of people for helpful discussions over an extended period, 
particularly Jim Donahue, Joseph Goguen, David MacQueen, Gordon Plotkin, Ed 
Satterthwaite and Eric Schmidt. Valuable feedback on the ideas and their presentation was ob- 
tained from members of IFIP Working Group 2.3. Much of our work was supported by the 
Xerox Palo Alto Research Center. Rod Burstall also had support from the Science Research 
Council, and he was enabled to complete this work by a British Petroleum Venture Research 
Fellowship. 






2. Informal description of Pebble 

This section describes the language, with some brief examples and some motivation. We first 
go through the conventional features such as expressions, conditionals and function definitions. 
Then we present those which have more interest: 

the use of bindings as values with declarations as their types;

the use of types as values; 

the extension of function and product types to dependent types; 
the method of defining polymorphic functions. 
Finally we say something about type-checking. 

The reader may wish to consult the formal description of the values and the formal syntax, 
given in § 4, when he is unclear about some point. Likewise the operational semantics given in 
§ 5 will clarify exact details of the type-checking and evaluation. 

2.1 Basic features

Pebble is based upon lambda calculus with types, using a fairly conventional notation. It is en- 
tirely functional and consists of expressions which denote values. 

We start by describing the values, which we write in this font. They are: 

• primitive values: integers and booleans; 

• function values: primitive operations, such as +, and closures, which are the values of
lambda expressions; 

• tuples: nil and pairs of values, such as [1, 2]; 

• bindings: values such as x~3 which associate a name with a value, and fix bindings which 
arise in defining recursive functions; 

• types: 

the primitive types int and bool, 
types formed by X and ->,
dependent types formed by * and ►,

the type type which is the type of all types including itself, and 
declarations, such as x: int, which are the types of bindings; 

• applications: primitive functions applied to arguments which need simplification, written 
primitive!value, and symbolic applications f%e which arise during type-checking. These
are not final values of expressions, but are used in the formal semantics. 

We now consider the various forms of expressions, leaving aside for the moment the details of 
bindings, declarations, and dependent types, which will be discussed in later sections. These 
are as follows: 



• applications: these are of the form "operator operand", for example factorial 6, with
juxtaposition to denote application. Parentheses and brackets are used purely for grouping. If
E1 is an expression of type t1->t2 and E2 is an expression of type t1, then E1 E2 is an
expression of type t2. As an abbreviation we allow infixed operators such as x+y for +[x, y].

• tuples: nil is an expression of type void. If E1 is an expression of type t1 and E2 of
type t2 then [E1, E2] is an expression of type t1Xt2. The brackets are not significant and
may be omitted. The functions fst and snd select components, thus fst[1, 2] is 1.

• conditionals: IF E1 THEN E2 ELSE E3 where E1 is of type bool.

• local definitions: LET B IN E evaluates E in the environment enriched by the binding B.
For example

LET x: int~y+z IN x+mod x

first evaluates y+z and then evaluates x+mod x with this value for x. The int may be
omitted, thus

LET x:~y+z IN ...

The binding may be recursive, thus

LET REC f: int->int ~ ... IN ...

We allow E WHERE B as an abbreviation for LET B IN E.

• function definitions: Functions are denoted by lambda expressions, for example

λ x: int->int IN x+mod x

which when applied to 3 evaluates 3 + mod 3. If T1 evaluates to t1, T2 evaluates to t2 and E
is an expression of type t2 provided that N is a name of type t1, then

λ N: T1->T2 IN E

is a function of type t1->t2. Functions of two or more arguments can be defined by using
X, for example

λ x: int X y: bool->int IN ...

We allow the abbreviation f: (i: int->int) IS ... for f: int->int ~ λ i: int->int IN ...

An example may help to make this all more digestible: 

LET REC fact: (n: int->int) IS
    IF n = 0 THEN 1 ELSE n*fact(n-1)
IN LET k:~2+2+2 IN
    fact(fst[k, k+1])

This all evaluates to factorial 6. Slightly less dull is 

LET twice: (f: int->int)->(int->int) IS
    λ n: int->int IN f(f n)
IN (twice fst) [[1, 2], 3]

which evaluates to fst(fst[[1, 2], 3]), that is 1. We shall see later how we could define a
polymorphic version of twice which would not be restricted to integer functions. 
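
The two examples above can be mimicked in an ordinary language. The following is only a sketch in Python, which is an assumption of this illustration and not Pebble notation: Python lists stand in for Pebble pairs, and the static type annotations of the Pebble text are dropped because Python does not check them.

```python
def fst(pair):
    """Select the first component of a pair, as Pebble's fst does."""
    return pair[0]


def fact(n):
    """Recursive factorial, mirroring LET REC fact ... IS ..."""
    return 1 if n == 0 else n * fact(n - 1)


# LET k:~2+2+2 IN fact(fst[k, k+1])
k = 2 + 2 + 2
result_fact = fact(fst([k, k + 1]))


def twice(f):
    """Return the function that applies f two times."""
    return lambda n: f(f(n))


# (twice fst) [[1, 2], 3]
result_twice = twice(fst)([[1, 2], 3])
```

Here result_fact is factorial 6 and result_twice is fst(fst[[1, 2], 3]).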

The reader will note the omission of assignment. Its addition would scarcely affect the syntax, 
but it would complicate the formal semantics by requiring the notion of store. It would also
complicate the rules for type-checking, since in order to preserve static type-checking, we
would have to make sure that types were constants, not subject to change by assignment. This 
matter is discussed further in § 3.6. 

2.2 Bindings and declarations

An unconventional feature of Pebble is that it treats bindings, such as x~3, as values. They
may be passed as arguments and results of functions, and they may be components of data
structures, just like integers or any other values. The expression x: int~3 has as its value the
binding x~3. A binding is evaluated by evaluating its right hand side and attaching this to the
variable. Thus if x is 3 in the current environment, the expression y: int~x+1 evaluates to the
binding y~4. The expression x: int~3 may be written more briefly x:~3.

The type of a binding is a declaration. Thus the binding expression x:~3 has as its type the
declaration x: int. Bindings may be combined by pairing, just like any other values. Thus
[x:~3, b:~true] is also a binding. After LET such a complex binding acts as two bindings "in
parallel", binding both x and b. Thus

LET x:~0 IN LET [x:~3, y:~x] IN [x, y]

has value [3, 0] not [3, 3], since both bindings in the pair are evaluated in the outer
environment. Thus the pair constructor is just like any other function. The type of the binding
[x:~3, b:~true] is (x: int)X(b: bool), since as usual if e1 has type t1 and e2 has type t2 then
[e1, e2] has type t1Xt2.

For convenience we have a syntactic sugar for combining bindings "in series". We write this
B1; B2, which is short for [B1, LET B1 IN B2]. There are no other operations on bindings, with
the possible exception of equality which could well be provided. 
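
The difference between combining bindings "in parallel" and "in series" can be sketched outside Pebble. In the Python illustration below, every name (let, parallel, series, bind_x, bind_y) is hypothetical: an environment and a binding are both modelled as dictionaries, and a binding expression is modelled as a function from an environment to a dictionary.

```python
def let(env, binding):
    """Return env enriched by binding; later names shadow earlier ones."""
    return {**env, **binding}


def parallel(env, *makers):
    """[B1, B2]: every component is evaluated in the same outer env."""
    result = {}
    for make in makers:
        result.update(make(env))
    return result


def series(env, make1, make2):
    """B1; B2, read as [B1, LET B1 IN B2]."""
    b1 = make1(env)
    b2 = make2(let(env, b1))  # B2 sees the names bound by B1
    return {**b1, **b2}


outer = let({}, {"x": 0})            # LET x:~0
bind_x = lambda env: {"x": 3}        # x:~3
bind_y = lambda env: {"y": env["x"]} # y:~x

env_par = let(outer, parallel(outer, bind_x, bind_y))
env_ser = let(outer, series(outer, bind_x, bind_y))

par_result = [env_par["x"], env_par["y"]]
ser_result = [env_ser["x"], env_ser["y"]]
```

With the parallel form y picks up the outer x, so the pair is [3, 0]; with the series form y sees the new x and the pair is [3, 3].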

Declarations occur not only as the types of bindings but also in the context of lambda
expressions. Thus in

λ x: int->int IN x+1

x: int is a declaration, and hence x: int->int is a type. In fact you may write any expression
after the λ provided that it evaluates to a type of the form d->t where d is a declaration. To
make two argument lambda expressions we simply use a X declaration, thus

λ x: int X y: int->int IN x+y

which is of type intXint->int, and could take [2, 3] as an argument. This introduces a certain
uniformity and flexibility into the syntax of lambda expressions.

We may write some unconventional expressions using bindings as values. For example, 

LET b:~(x:~3) IN LET b IN x

which evaluates to 3. Another example is 

LET f:~(λ b: (x: int X y: int)->int IN
          LET b IN x+y) IN ...

which also evaluates to 3. Here f takes as argument not a pair of integers but a binding.
The main intended application of bindings as values is in elucidating the concept of
parameterised module. Such a module delivers a binding as its result; thus, a parameterised
module is a function from bindings to bindings. Consider a module which implements sorting,
requires as parameter a function lesseq on integers, and produces as its result functions
issorted and sort. It could be represented by a function from bindings to bindings whose type
would be

(lesseq: intXint->bool) -> (issorted: list int->bool)X(sort: list int->list int)
We go into this in more detail in § 3.1. 
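
As a rough illustration of a module as a function from bindings to bindings, the sorting module can be sketched in Python. This is an assumption-laden sketch, not Pebble: a binding is modelled as a dictionary from names to values, the interface is not checked statically, and the choice of insertion sort is arbitrary.

```python
def sort_module(params):
    """Map a binding of lesseq to a binding of issorted and sort."""
    lesseq = params["lesseq"]

    def issorted(xs):
        # Every adjacent pair must be in order according to lesseq.
        return all(lesseq(a, b) for a, b in zip(xs, xs[1:]))

    def sort(xs):
        # Insertion sort driven only by the supplied lesseq.
        out = []
        for x in xs:
            i = 0
            while i < len(out) and lesseq(out[i], x):
                i += 1
            out.insert(i, x)
        return out

    return {"issorted": issorted, "sort": sort}


# Linking is function application: two implementations of the same
# input interface yield two instances of the module.
ascending = sort_module({"lesseq": lambda a, b: a <= b})
descending = sort_module({"lesseq": lambda a, b: a >= b})
```

Applying sort_module to different lesseq bindings corresponds to satisfying one input interface with different implementations.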

Pebble also has an anti-LET, which impoverishes the environment instead of enriching it: 

IMPORT N IN E

evaluates E in an environment in which N is the only name which is bound. For example:

LET N:~B IN IMPORT N IN LET N IN x

The value of this expression is the value of x in the binding B, if x is indeed bound by B.
Otherwise it has no value. This is very useful if B is a named collection of values from which
we want to obtain the one named x. Without IMPORT, if x is missing from B we would pick up
any x that happens to be in the current environment. IMPORT is so useful that we provide the
syntactic sugar B$x for it.
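
The effect of IMPORT can be sketched with dictionaries standing in for environments and bindings. The Python names below (let, dollar, the reserved key "__N") are hypothetical and chosen only for this illustration; a missing name is modelled by a KeyError, which is one possible reading of "it has no value".

```python
def let(env, binding):
    """Return env enriched by binding."""
    return {**env, **binding}


def dollar(binding, field):
    """B$x, modelled on LET N:~B IN IMPORT N IN LET N IN x.

    "__N" plays the role of the name N and is assumed not to clash
    with any name bound by the binding.
    """
    env = {"__N": binding}        # after IMPORT only N is bound
    env = let(env, env["__N"])    # LET N IN ...: add the names bound by B
    return env[field]             # KeyError if B does not bind the field


outer = {"x": 99}                 # an x that happens to be in scope
b = {"y": 1}                      # a binding that does not bind x

without_import = let(outer, b)["x"]   # silently picks up the outer x
```

Here without_import is 99, whereas dollar(b, "x") fails instead of returning the unrelated outer x.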

2.3 Types 

We now explain how the kernel language handles types. It may be helpful to begin by
discriminating between some of the different senses in which the word 'type' is customarily used.
We use ADT to abbreviate 'Abstract Data Type'.

• Predicate type - simply denoting a set of values.
Example: bool considered as {true, false}.

• Simple ADT - a single predicate type with a collection of associated operations.
Example: stack with particular operations:

push: intXstack->stack, ... etc.

• Multiple ADT - several predicates (zero or more) with a collection of associated operations.
Example: point and line with particular operations:

intersection: lineXline->point, ... etc.

• ADT declaration - several predicate names with a collection of associated operation names,
each having inputs and outputs of given predicate names.

Example: predicate names point and line with operator names:

intersection: lineXline->point, ... etc.

The simple ADT is a special case of the multiple ADT which offers notational and other con- 
veniences to language designers. For the ADT declaration we may think of a collection of 
(predicate) type and procedure declarations, as opposed to the representations of the types and 
the code for the operations. 

Some examples of how these concepts appear in different languages may help. The last column 
gives the terminology for many sorted algebras. 
                  Pascal  CLU      Mesa            Ada           Russell  ML             Algebra

Predicate type    type    type     type            type          -        type           sort
Simple ADT        -       cluster  -               -             type     -              algebra
Multiple ADT      -       -        implementation  package body  -        abstract type  algebra
ADT declaration   -       -        interface       package spec  -        -              signature



In Pebble we take as our notion of type the first of these, predicate types. Thus a type is simply 
a means of classifying values. We are then able to define entities which are simple ADT's, mul- 
tiple ADT's and ADT declarations. To do this we make use of the notions of binding and 
declaration already explained, and the notion of dependent type explained below. 

Pebble treats types as values, just like integers and other traditional values. We remove the 
sharp distinction between "compile time" and "run time", allowing evaluation (possibly 
symbolic) at compile time. This seems appropriate, given that one of our main concerns is to 
express the linking of modules and the checking of their interfaces in the language itself. 
Treating types as values enriches the language to a degree at which we might lose control of 
the phenomena, but we have adopted this approach to get a language which can describe the 
facilities we find in existing languages such as Mesa and Cedar. A similar but more 
conservative approach, which maintains the traditional distinction between types and values, is 
being pursued by David MacQueen at Bell Labs, with some collaboration of one of us (RB).
He has recently applied these ideas to the design of a module facility for ML [MacQueen 1984].
The theoretical basis for this work has been developed in [MacQueen and Sethi 1982, 
MacQueen, Plotkin and Sethi 1984]. 

2.4 Polymorphism 

A function is said to be polymorphic if it can accept an argument of more than one type; for 
example, an equality function might be willing to accept either a pair of integers or a pair of 
booleans. To clarify the way Pebble handles polymorphism we should first discuss some dif- 
ferent phenomena which may be described by this term. We start with a distinction (due we 
believe to C. Strachey) between ad hoc and universal polymorphism. 

Ad hoc polymorphism - the code executed depends on the type of the argument, e.g.
'print 3' involves different code from 'print "nonsense"'.

Universal polymorphism - the same code is executed regardless of the type of the
argument, since the different types of data have uniform representation, e.g. reverse (1, 2, 3, 4)
and reverse (true, false, false).

We have made this distinction in terms of program execution, lacking a mathematical theory. 
Recently Reynolds has offered a mathematical basis for this distinction [Reynolds 1983].



In Pebble we take universal polymorphism as the primitive idea. We are able to program ad 
hoc polymorphic functions on this basis (see § 3.3 on generic types). But universal 
polymorphism may itself be handled in two ways: explicit parameterisation or unification.

Explicit parameterisation - when we apply the polymorphic function we pass an extra
argument (parameter), namely the type required to determine the particular instance of
the polymorphic function being used. For example, reverse would take an argument t
which is a type, as well as a list. If we want to apply it to a list of integers we would supply
the type int as the value of t, writing reverse(int)(1, 2, 3, 4) and reverse(bool)(true, false,
false). To understand the type of reverse we need the notion of dependent type, to be
introduced later. This approach is due to Reynolds [Reynolds 1974] and is used in Russell
and CLU.

Unification - the type required to instantiate the polymorphic function when it is applied
to a particular argument need not be supplied as a parameter. The type-checker is able to
determine it by inspecting the type of the argument and the type of the required result. A
convenient and general method of doing this is by using unification on the type
expressions concerned [Milner 1978]; this method is used in ML [Gordon, Milner and
Wadsworth]. For example we may write reverse(1, 2, 3, 4). Following Girard [Girard 1972]
we may regard these type variables as universally quantified. The type of reverse would
then be 'for all t: type . list(t) -> list(t)'. This form is used by MacQueen and Sethi [1982].
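
Explicit parameterisation can be imitated in a language where types are ordinary run-time values. The sketch below uses Python as an assumption of this illustration: the type is passed as a first argument, and a run-time check stands in for the static check that a typed language would perform.

```python
def reverse(t):
    """Explicit parameterisation: the element type t is supplied first."""

    def reverse_at_t(xs):
        # A run-time stand-in for the static check that xs is a list of t.
        if not all(type(x) is t for x in xs):
            raise TypeError(f"expected a list of {t.__name__}")
        return list(reversed(xs))

    return reverse_at_t


reversed_ints = reverse(int)([1, 2, 3, 4])
reversed_bools = reverse(bool)([True, False, False])
```

The same code runs for both instances; only the type argument differs, which is the sense of universal polymorphism described above.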

In Pebble we adopt the explicit parameterisation form of universal polymorphism. This has 
been traditional when considering instantiation of modules, as in CLU or in Ada generic types. 
To instantiate a module we must explicitly supply the parameter types and procedures. Thus 
before we can use a generic Ada package to do list processing on lists of integers, we must in- 
stantiate it to integers. The pleasures of unification polymorphism as in ML seem harder to 
achieve at the module level; in fact one seems to get involved with second order unification. 
This is an open area for research. It must be said that explicit parameterisation makes program- 
ming in the kernel language more tedious. We hope to avoid the tedium in future versions of 
Pebble by sugar which automatically supplies a value for the type parameter.

For example, we might want to define a polymorphic function for reversing a pair, thus 
swap[int, bool][3, true]

which evaluates to [true, 3]. Here swap is applied to the pair of types [int, bool] and delivers a 
function whose type is intXbool->boolXint. The type of swap is a dependent type; we will
explain this in the next section, and then we will be able to define the function swap. 

2.5 Dependent types

We now consider the idea of dependent type [Girard 1972, Demers and Donahue 1980]. We
will need two kinds of dependent type constructor, one analogous to -> for dealing with
functions, the other analogous to X for dealing with pairs. We start with the former.
We might think naively that the type of swap would be

(typeXtype)->(t1Xt2->t2Xt1)

but of course this is nonsense because the type variables t1 and t2 are not bound anywhere. The
fact is that the type of the result depends on the values of the arguments. Here the arguments are
a pair of types and t1 and t2 are the names for these values. We need a special arrow ->>
instead of -> to indicate that we have a dependent type; to the left of the ->> we must declare
the variables t1 and t2. So the type of swap is actually the value of

(t1: type X t2: type)->>(t1Xt2->t2Xt1)

In order to have only one name-binding mechanism, we take this value to be

(t1: type X t2: type)►c

where c is the closure which is the value of

λ t1: type X t2: type->type IN (t1Xt2->t2Xt1)

and ► is a new value constructor for dependent function types. For example, the type of
swap[int, bool] is intXbool->boolXint.

We may now define swap by

swap: (t1: type X t2: type)->>(t1Xt2->t2Xt1) IS
    λ x1: t1 X x2: t2->t2Xt1 IN [x2, x1]
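
The idea that the result type of swap is computed by a closure from the type arguments can be sketched as follows. This is a Python illustration under stated assumptions: types are represented by Python classes and by tagged tuples built with the hypothetical helpers product and arrow, and swap_result_type plays the role of the closure c.

```python
def product(t1, t2):
    """Represent the product type t1Xt2 as a tagged tuple."""
    return ("X", t1, t2)


def arrow(t1, t2):
    """Represent the function type t1->t2 as a tagged tuple."""
    return ("->", t1, t2)


def swap_result_type(t1, t2):
    """The closure c: from the type arguments, compute the type of swap[t1, t2]."""
    return arrow(product(t1, t2), product(t2, t1))


def swap(t1, t2):
    """swap takes a pair of types and delivers a function on pairs."""

    def swap_at(pair):
        x1, x2 = pair
        # Run-time stand-in for checking the argument against t1Xt2.
        if type(x1) is not t1 or type(x2) is not t2:
            raise TypeError("argument does not have type t1Xt2")
        return [x2, x1]

    return swap_at


swapped = swap(int, bool)([3, True])
type_of_instance = swap_result_type(int, bool)
```

The value of type_of_instance corresponds to intXbool->boolXint, obtained by applying the closure to the pair of types rather than by a separate binding mechanism for type variables.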

Another example would be the list reversing function 

REC reverse: (t: type)->>(list t->list t) IS
    λ l: list t->list t IN IF l=nil THEN l ELSE append[reverse tail l, [head l, nil]]

A similar phenomenon occurs with the type of pairs. Suppose for example that the first
element of a pair is to be a type and the second element is to be a value of that type; thus
[int, 3] and [bool, false] denote such pairs. The type of all such pairs may be written
(t: type)XXt. As we did with ->>, we take its value to be (t: type)*c where c is the closure
which is the value of λ t: type->type IN t, and * is a new value constructor for dependent
product types. It is a dependent type because the type of the second element depends on the
value of the first. Actually it is more convenient technically to let this type include all pairs
whose first element is not just a type but a binding of a type to t. So expressions of type
(t: type)XXt are [t:~int, 3] and [t:~bool, false] for example.
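
Membership in the dependent product type (t: type)XXt can be sketched as a check that applies a closure to the first component. The Python below is only an illustration under assumptions: a binding is a dictionary, Python classes stand for types, and the function names are hypothetical.

```python
def c(binding):
    """The closure: given a binding of t, return the type of the second element."""
    return binding["t"]


def has_dependent_pair_type(pair):
    """Check that pair is [t:~T, v] where v has the type T bound to t."""
    binding, value = pair
    if set(binding) != {"t"} or not isinstance(binding["t"], type):
        return False
    # The required type of the second element depends on the first.
    return type(value) is c(binding)


ok_int = has_dependent_pair_type([{"t": int}, 3])
ok_bool = has_dependent_pair_type([{"t": bool}, False])
mismatch = has_dependent_pair_type([{"t": int}, False])
```

The third pair is rejected because its second element is a boolean while the first element binds t to int.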

A more realistic example might be 

Automaton: type ~ (input: type X state: type X output: type) XX
                      ((inputXstate->state) X (state->output))

Values of the type Automaton are pairs, consisting of

(i) three types called input, state and output;

(ii) a transition function and an output function.

By "three types called input, state and output" we mean a binding of types to these names.



2.6 Type-checking



Given an expression in Pebble, we first type-check it and then evaluate it. However, the type- 
checking will involve some evaluation; for example, we will have to evaluate subexpressions 
which denote types and those which make bindings to type variables. Thus there are two dis- 
tinct phases of evaluation: evaluation during type-checking and evaluation proper to get the 
result value. These both follow the same rules, but evaluation during type-checking may make 
use of symbolic values at times when the actual values are not available; this happens when we 
type-check a lambda expression. 

For each form of expression we need 

(i) a type-checking rule with a conclusion of the form: E has type t.

(ii) an evaluation rule with a conclusion of the form: E has value e. 

The type-checking rule may invoke the evaluation rules on subexpressions, but the evaluation
rule should not need to invoke type-checking rules. 

For example, an expression of the form LET ... IN ... is type-checked using the following rules. 

The type of LET B IN E is found thus:

If the type of B is void then it is just the type of E.

If the type of B is N: t0 then it is the type of E in a new environment computed thus:

    evaluate B and let e0 be the right hand side of its value;

    the new environment is the old one with N taking type t0 and value e0.

If the type of B is d1Xd2 then

    evaluate B and let b2 be the second of its value;

    now the result is the type of LET fst B IN LET b2 IN E.

If the type of B is a dependent type of the form d1*f then this must be reduced to the
previous d1Xd2 case by applying f to the binding fst B to get d2.

The type of a binding of the form D~E is:
the value of D if

    it is void and E has type void,
    or if it is N: t and E has type t,

    or if it is d1Xd2 and [d1~fst E, d2~snd E] has type d1Xd2;

otherwise, if the value of D is a dependent type of the form d1*f, then this must be
reduced to the d1Xd2 case by applying f to the binding (d1~fst E) to get d2.

The type of a recursive binding REC D~E is just the value of D, provided that a somewhat
complicated check on the type of E succeeds.



The type of a binding which is a pair is calculated as usual for a pair of expressions. 



The value of a binding of the form D~E is as follows:

    If the value of D is void then nil.

    If the value of D is N: t then N~e, where e is the value of E.

    If the value of D is d1×d2 then the value of [d1~fst E, d2~snd E].

    If the value of D is a dependent type then we need to reduce it to the previous case (as
    before).
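
The non-dependent cases of this evaluation rule can be sketched in Python. This is an illustration only and not part of Pebble: the tuple encodings of declarations and bindings, and the helper names bind and names, are assumptions made for the sketch, and the dependent case is left out.

```python
# Hypothetical encodings (not from the paper):
#   declarations: None is void, ("name", N, t) is N: t, ("cross", d1, d2) is d1×d2
#   values:       () is nil, a 2-tuple is a pair
#   bindings:     () is nil, ("bind", N, e) is N~e, a 2-tuple is a pair of bindings

def bind(decl, value):
    """Value of the binding D~E, given the value of D and the value of E."""
    if decl is None:                       # void: the only value is nil
        if value != ():
            raise TypeError("void can only be bound to nil")
        return ()
    if decl[0] == "name":                  # N: t gives N~e
        return ("bind", decl[1], value)
    if decl[0] == "cross":                 # d1×d2 gives [d1~fst E, d2~snd E]
        fst, snd = value
        return (bind(decl[1], fst), bind(decl[2], snd))
    raise ValueError("dependent declarations are not handled in this sketch")


def names(binding):
    """Flatten a binding value into a dict from names to values."""
    if binding == ():
        return {}
    if binding[0] == "bind":
        return {binding[1]: binding[2]}
    left, right = binding
    return {**names(left), **names(right)}


# A declaration for x with a pair type, bound to the value of [1+1, 0].
x_decl = ("name", "x", ("cross", "int", "int"))
print(names(bind(x_decl, (1 + 1, 0))))     # {'x': (2, 0)}
```

The flattened dictionary plays the role of the value half of the environment extension that LET performs; the type half would come from the declaration itself.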

A couple of examples may make this clearer. We give them as informal proofs. The proofs are 
not taken down to the lowest level of detail, but display the action of the rules just given. 

Example: 

    LET x: int×int ~ [1+1, 0] IN fst x

has type int (and value 2). To show this, we first compute the type of the binding.

    x: int×int ~ [1+1, 0] has type x: int×int because

        x: int×int has type type and
        x: int×int has value x: int×int and
        [1+1, 0] has type int×int

This is of the form N: t, so we evaluate the binding.

    x: int×int ~ [1+1, 0] has value x~[2, 0]

We type-check fst x in the new environment formed by adding [x: int×int] and [x~[2, 0]].
In this environment fst x has type int. This is the type of the whole expression. 

Here is a second rather similar example in which LET introduces a type name. It shows why it
is necessary to evaluate the binding after the LET, not just type-check it. We need the
appropriate binding for any type names which may appear in the expression after IN. Here t in
t: type~int is such a name, and we need its binding to evaluate the rest of the expression.

Example:

    LET t: type~int IN
    LET x: t~1 IN x+1

has type int (and incidentally value 2). We first type-check the binding of the first LET ... IN.

    t: type~int has type t: type and value t~int

In the new environment formed by adding [t: type] and [t~int] we must type-check LET x: t~1
IN x+1. This has type int because

    x: t~1 has type x: int and
    x: t~1 has value x~1 and
    in the new environment formed by adding [x: int] and [x~1], x+1 has type int.



What about type-checking lambda expressions? For expressions such as

    λ x: int→int IN x+1

this is straightforward. We can simply type-check x+1 in an environment enriched by [x: int].
But we must also consider polymorphic functions such as

    λ t: type→>(t→t) IN λ x: t→t IN E

We would like to know the type of x when type-checking the body, but this depends on the
argument supplied for t. However we want the lambda-expression to type-check no matter
what argument is supplied, since we want it to be universally polymorphic. Otherwise we
would have to type-check it anew each time it is given an argument, and this would be
dynamic rather than static type-checking. So we supply a dummy, symbolic value for t and use
this while type-checking the rest of the expression. That is, we type-check

    λ x: t→t IN E

in an environment enriched by [t: type] and [t~newconstant], where newconstant is a
symbolic value of type type, distinct from all other symbolic values which we may invent. Under
this regime a function such as

    λ t: type→>(t→t) IN λ x: t→t IN x

will type-check (it has type denoted by t: type→>(t→t)) but

    λ t: type→>(t→t) IN λ x: t→t IN x+1

will fail to type-check because it only makes sense if t is int.

Thus it is necessary that at type-checking time evaluation can give a symbolic result, since we
may come across a newconstant. How do we apply one of the primitives to such a value? It
will simply produce a value w!e which cannot be simplified. But what if the operator is
symbolic? We introduce a special value constructing operator % to permit the application of a
symbolic function to an argument. So if f is symbolic the result of applying f to e is just f%e. This
enables us to do symbolic evaluation at compile time and to compare types as symbolic
expressions.
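
To make the role of the symbolic constant concrete, here is a minimal Python sketch of checking the body of a polymorphic function once, for every possible argument type. It is an illustration under our own assumptions, not the Pebble rules: the class NewConstant, the helpers apply_symbolic, type_of and check_body, and the tiny expression encoding are all hypothetical.

```python
import itertools

_fresh = itertools.count()


class NewConstant:
    """A symbolic value, distinct from every other symbolic value."""

    def __init__(self, hint):
        self.name = f"{hint}#{next(_fresh)}"

    def __repr__(self):
        return self.name


def apply_symbolic(f, e):
    """f%e: the unsimplified application of a symbolic function f to e."""
    return ("%", f, e)


def type_of(expr, env):
    """Type of a tiny expression: an int literal, a name, or ("+", a, b)."""
    if isinstance(expr, int):
        return "int"
    if isinstance(expr, str):
        return env[expr]
    op, a, b = expr
    if op == "+" and type_of(a, env) == "int" and type_of(b, env) == "int":
        return "int"
    raise TypeError(f"cannot type-check {expr!r}")


def check_body(body):
    """Check a body with x: t and expected result type t, for a symbolic t."""
    t = NewConstant("t")               # t ~ newconstant
    result = type_of(body, {"x": t})   # x: t, with t symbolic
    if result is not t:                # symbolic types are compared by identity
        raise TypeError(f"body has type {result!r}, expected {t!r}")
    return True
```

With these definitions check_body("x") succeeds, while check_body(("+", "x", 1)) raises TypeError, because + needs an int and the symbolic t is not known to be int.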



3. Applications



This section presents a number of applications of Pebble, mainly to programming in the large: 
interfaces and implementations, and abstract data types. We also give treatments of generic 
types, union types, recursive types such as list, and assignment. The point is to see how all 
these facilities can be provided simply in Pebble. 

3.1 Interfaces and implementations

The most important recent development in programming languages is the introduction of an
explicit notion of interface to stand between the implementation of an abstraction and its clients.
To paraphrase Parnas:

An interface is the set of assumptions that a programmer needs to make about another 
program in order to show the correctness of his program. 

Sometimes an interface is called a specification (e.g., in Ada, where the term is package 
specification). We will call the other program an implementation of the interface, and the 
program which depends on the interface the client. 

In a practical present-day language, it is not possible to check automatically that the interface 
assumptions are strong enough to make the client program correct, or that an implementation 
actually satisfies the assumptions. In fact, existing languages cannot even express all the assump- 
tions that may be needed. They are confined to specifying the names and types of the 
procedures and other values in the interface. 

This is exactly the function of a definition module in Mesa or Modula 2, a package specifica- 
tion in Ada, or a module type in Euclid. These names and types are the assumptions which the 
client may make, and which the implementation must satisfy by providing values of the proper 
types. In one of these languages, we might define an interface for a real number abstraction as 
follows: 

    interface Real;
        type real;
        function plus(x: real; y: real): real;
    end

and an implementation of this interface, using an existing type float, might look like this:

    implementation RealFl implements Real;
        type real = float;
        function plus(x: real; y: real): real;
        begin
            if . . . then . . . else . . . end;
            return . . .;
        end;
    end

In Pebble an interface such as Real is simply a declaration for a type Real$real and various
functions such as plus; an implementation of Real is a binding whose type is Real. Here is the
interface:

    Real: type ~ (real: type ××
                  plus: (real × real → real) × . . .);

Note that this is a dependent type: the type of plus depends on the value of Real$real.

Now for the implementation, a binding with type Real. It gives real the value float, which must
denote some already-existing type, and it has an explicit λ-expression for plus.

    RealFl: Real ~ [real :~ float;
                    plus :~ λ x: real × y: real → real IN (IF ... THEN ... ELSE ...), . . .]

On this foundation we can define another interface Complex, with a declaration for a mod
function which takes a Complex$complex to a RealFl$real.

    Complex: type ~ (complex: type ××
                     mod: complex → RealFl$real × . . .)

If we don't wish to commit ourselves to the RealFl implementation, we can define a
parameterized interface MakeComplex, which takes a Real parameter:

    MakeComplex: (R: Real → type) IS
        (complex: type ××
         mod: complex → R$real × . . .)

Then the previous Complex can be defined by

    Complex: type ~ MakeComplex(RealFl)

This illustrates the point that a module is usually a function producing some declaration or 
binding (the one it defines) from other declarations and bindings (the interfaces and imple- 
mentations it depends on). 

Now the familiar cartesian and polar implementations of complex numbers can be defined, 
still with a Real parameter. This is possible because the implementations depend on real num- 
bers only through the elements of a binding with type Real: the real type, the plus function, etc. 

    MakeCartesian: (R: Real →> MakeComplex(R)) IS
        [complex :~ R$real × R$real;
         mod :~ λ c: complex → R$real IN R$sqrt((fst c)² + (snd c)²), . . .];

    MakePolar: (R: Real →> MakeComplex(R)) IS
        [complex :~ R$real × R$real;
         mod :~ λ c: complex → R$real IN fst c, . . .];

These are functions which, given an implementation of Real, will yield an implementation of
MakeComplex(Real). To get actual implementations of Complex (which is MakeComplex(RealFl)),
we apply these functions:

    Cartesian: Complex ~ MakeCartesian(RealFl);
    Polar: Complex ~ MakePolar(RealFl);

If we don't need the flexibility of different kinds of complex numbers, we can dispense with
the Make functions and simply write:

    Cartesian: Complex ~ [complex :~ R × R;
        mod :~ λ c: complex → R IN RealFl$sqrt((fst c)² + (snd c)²), . . .],

    Polar: Complex ~ [complex :~ R × R;
        mod :~ λ c: complex → R IN fst c, . . .]

    WHERE R :~ RealFl$real

To show how far this can be pushed, we define an interface Transform which deals with real 
numbers and two implementations of complex numbers. Among other things, it includes a map 
function which takes one of each kind of complex into a real. 



    Transform: (R: Real ×× C1: MakeComplex(R) × C2: MakeComplex(R) → type) IS
        (map: (C1$complex × C2$complex → R$real) × . . .);

Note how this declaration requires C1 and C2 to be based on the same implementation of Real.
An implementation of this interface would look like:

    TransformCP: Transform(RealFl, Cartesian, Polar) ~
        [map :~ λ C1: Cartesian$complex × C2: Polar$complex → RealFl$real IN
            IF . . . THEN . . . ELSE . . .];

Thus in Pebble it is easy to obtain any desired degree of flexibility in defining interfaces and
implementations. In most applications, the amount of parameterization shown in these
examples is not necessary, and definitions like the simpler ones for Cartesian and Polar would be
used.

We leave it as an exercise for the reader to recast the module facilities of Ada, CLU, Euclid and 
Mesa in the forms of Pebble. 
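
The idea that a module is a function from bindings to bindings can be imitated in any language with first-class records. The following Python sketch is only an analogy, since Python checks none of the declarations statically. The names RealFl, make_cartesian and make_polar mirror the Pebble text; the times component is an assumption standing in for part of the elided ". . .", and SimpleNamespace is used as a stand-in for a binding.

```python
import math
from types import SimpleNamespace

# RealFl: a binding for the Real interface, built on Python floats.
# times is an assumed extra component; sqrt is used by the Pebble text.
RealFl = SimpleNamespace(
    real=float,
    plus=lambda x, y: x + y,
    times=lambda x, y: x * y,
    sqrt=math.sqrt,
)


def make_cartesian(R):
    """Given a binding R for Real, return a binding for MakeComplex(R)."""
    return SimpleNamespace(
        complex=(R.real, R.real),
        mod=lambda c: R.sqrt(R.plus(R.times(c[0], c[0]), R.times(c[1], c[1]))),
    )


def make_polar(R):
    """Polar form: the first component of the pair is already the modulus."""
    return SimpleNamespace(
        complex=(R.real, R.real),
        mod=lambda c: c[0],
    )


Cartesian = make_cartesian(RealFl)
Polar = make_polar(RealFl)

print(Cartesian.mod((3.0, 4.0)))   # 5.0
print(Polar.mod((2.0, 0.5)))       # 2.0
```

Both implementations reach the real numbers only through the components of R, which is what allows the same functions to be applied to a different implementation of Real.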

3.2 Abstract data types 

An abstract data type glues some operations to a type; e.g., a stack with push, pop, top etc. 
Clients of the abstraction are not allowed to depend on the value of the type (e.g., whether a 
stack is represented as a list or an array), or on the actual implementations of the operations. In 
Pebble terms, the abstract type is a declaration, and the client takes an implementation as a 
parameter. Thus 

    intStackDecl: type ~ (stk: type ××
        empty: stk ×
        isEmpty: (stk → bool) ×
        push: (int × stk → stk) ×
        top: (stk → int) × . . .)

is an abstract data type for a stack of ints. We have used a dependent ×× type to express the
fact that the operations work on values of type stk which is also part of the abstraction. We
could instead have given a parameterized declaration for the operations:

    intStackOpsDecl: (stk: type → type) IS
        (empty: stk ×
         isEmpty: (stk → bool) ×
         push: (int × stk → stk) ×
         top: (stk → int) × . . .)

Matters are somewhat complicated by the fact that the abstraction may itself be parameterized. 
We would probably prefer a stack abstraction, for example, that is not committed to the type of 
value being stacked. This gives us still more choices about how to arrange things. To illustrate 
some of the possibilities, we give definitions for the smallest reasonable pieces of a stack 
abstraction, and show various ways of putting them together. 

We begin with a function producing a declaration for the stack operations; it has both the ele- 
ment type elem and the stack type stk as parameters: 

    StackOpsDecl: (elem: type × stk: type → type) IS
        (empty: stk ×
         isEmpty: (stk → bool) ×
         push: (elem × stk → stk) ×
         top: (stk → elem) × . . .)

With this we can write the previous definition of intStackOpsDecl more concisely as

    intStackOpsDecl: (stk: type → type) IS
        StackOpsDecl(int, stk)

The type of a conventional stack abstraction, parameterized by the element type, is a function
that produces a declaration for a dependent type:

    StackDecl: (elem: type → type) IS stk: type ××
        StackOpsDecl(elem, stk);

and we can write the previous intStackDecl as

    intStackDecl: type ~ StackDecl int

Leaving the element type unbound, we can write an implementation of StackDecl using lists to
represent stacks.

    StackFromList: (el: type →> StackDecl el) IS
        (stk :~ list el;
         empty :~ nil,
         isEmpty: (s: stk → bool) IS s = nil;
         . . . )

        WHERE list: type → type ~ . . .

Here we have given the type of list but omitted the implementation, which is likely to be primi- 
tive. 

By analogy with list, if we have only one implementation of stacks to deal with we will
probably just call it stack, rather than StackFromList. In particular, an ordinary client is
probably in this position, and will be written

    Client: (stack: (el: type →> StackDecl el) → . . .) IS LET intStack :~ stack int IN
        -- Client body -- . . .

This arrangement for the implementation leaves something to be desired in security. The client
body is type-checked without any knowledge of the list implementation, and hence cannot
compromise its security. However, the enclosing program which includes both looks like

    LET StackFromList :~ -- as above -- . . ., Client :~ -- as above -- . . . IN
        . . . Client(StackFromList) . . .

and this program is in a position to construct a list int and pass it off as an intStack$stk. To
defend itself against such forgeries, an implementation such as StackFromList may need a way
to protect the ability to construct a stk value. To this end we introduce the primitive

    AbstractType: (T: type × p: Password →>
        AT: type ×× abs: (T → AT) × rep: (AT → T)) ~ . . .;

This function returns a new type AT, together with functions abs and rep which map back and
forth between AT and the parameter type T. Values of type AT can only be constructed by the
abs function returned by a call of AbstractType with the same Password.

Other languages with a similar protection mechanism (for example, ML) do not use a password,
but instead make AbstractType non-applicative, so that it returns a different AT each time it is
called. This ensures that no intruder can invoke AbstractType on his own and get hold of the abs
function. We have not used this approach for two reasons. First, a non-applicative AbstractType
does not fit easily into the formal operational semantics for Pebble. Both the intuitive notion of
type-checking described in § 2 and the formal one in § 5 depend on the fact that identical
expressions in the same environment have the same value, i.e., that all functions are applicative.
The use of a password to make an abstract type unique is quite compatible with this approach.

Second, we think of converting a value v to an abstract value abs(v) as a way of asserting some
invariant that involves v. The implementations of operations on abs(v) depend on this invariant
for their correctness. The implementer is responsible for ensuring that the invariant does in
fact hold for any v in an expression abs(v); he does this by:

    checking that each application of abs in his code satisfies a suitable pre-condition;

    preventing any use of abs outside his code, so that every application is checked.

A natural way to identify the implementer is by his knowledge of a suitable password. This
requires no extensions to the language, and the only assumption it requires about the
programming system is that other programmers do not have access to the text of the implementation,
but only to the interface. We want this to be true anyway.

Using AbstractType we can write a secure implementation: 

    StackFromList: (el: type →> StackDecl el) IS
        LET (st :~ a$AT, abs :~ a$abs, rep :~ a$rep
             WHERE a :~ AbstractType(list el, 314159)) IN
        (stk :~ st;
         procs :~ (empty :~ abs nil;
                   isEmpty: (s: st → bool) IS rep s = nil;
                   . . .))

Here we are also showing how to rename the values produced by AbstractType\ if the names 
provided by its declaration are satisfactory, we could simply write 

    StackFromList: (el: type →> StackDecl el) IS LET AbstractType(list el, 314159) IN
        (stk :~ AT;
         procs :~ (empty :~ abs nil;
                   isEmpty: (s: stk → bool) IS rep s = nil;
                   . . .))

The abs and rep functions are not returned from this StackFromList, and because of the
password, there is no way to make a type equal to the AT which is returned. Hence the
program outside the implementation has no way to forge or inspect AT values.
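
The password scheme can be imitated in Python as a rough sketch. It is an analogy only: Python cannot actually hide the representation from a determined client, and the names abstract_type and stack_from_list, the memo table, and the choice of Python's list as the representation type are assumptions made for this illustration. The point it shows is that the function stays applicative (the same type and password give the same AT) while a different password gives an unrelated type.

```python
from types import SimpleNamespace

_abstract_types = {}   # (T, password) -> binding, so equal calls give equal results


def abstract_type(T, password):
    """Return a binding with AT, abs and rep for representation type T."""
    key = (T, password)
    if key not in _abstract_types:
        class AT:
            def __init__(self, value):
                self._value = value

        def abs_(v):
            if not isinstance(v, T):
                raise TypeError("abs expects a value of the representation type")
            return AT(v)

        def rep(a):
            if not isinstance(a, AT):
                raise TypeError("rep expects a value of the abstract type")
            return a._value

        _abstract_types[key] = SimpleNamespace(AT=AT, abs=abs_, rep=rep)
    return _abstract_types[key]


def stack_from_list():
    """A stack binding whose stk values can only be built through abs."""
    a = abstract_type(list, 314159)
    return SimpleNamespace(
        stk=a.AT,
        empty=a.abs([]),
        is_empty=lambda s: a.rep(s) == [],
        push=lambda x, s: a.abs([x] + a.rep(s)),
        top=lambda s: a.rep(s)[0],
    )
```

A client that holds only the returned binding can push and inspect stacks, but passing a raw list to top raises TypeError, since the list was never wrapped by abs.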

Sometimes it is convenient to include the element type in the abstraction: 

    aStackDecl: type ~ elem: type ××
        stk: type ××
        StackOpsDecl[elem, stk]

This allows generic stack-bashing functions to be written more neatly. An aStackDecl value is a
binding. For example, redefining intStack,

    intStack: aStackDecl ~ (elem :~ int, StackFromList int)



An example of a generic function is 

    Reverse: (S: aStackDecl ×× x: S$stk →> S$stk) IS LET S IN
        LET rev: (y: stk × z: stk → stk) IS
            IF isEmpty y THEN z ELSE rev(pop y, push(top y, z))
        IN rev(x, empty)

so that Reverse(intStack, intStack$MakeStack[1, 2, 3]) = intStack$MakeStack[3, 2, 1]
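
A generic function of this kind takes the stack binding as an ordinary argument and touches stacks only through its operations. The following Python sketch illustrates this under our own assumptions: list_stack is a hypothetical binding that represents stacks as nested pairs, and make_stack is a hypothetical helper in the spirit of MakeStack.

```python
from types import SimpleNamespace

# A hypothetical stack binding; stacks are nested pairs (top, rest), () is empty.
list_stack = SimpleNamespace(
    empty=(),
    is_empty=lambda s: s == (),
    push=lambda x, s: (x, s),
    pop=lambda s: s[1],
    top=lambda s: s[0],
)


def make_stack(S, items):
    """Build a stack whose top is the first item of the list."""
    s = S.empty
    for x in reversed(items):
        s = S.push(x, s)
    return s


def reverse(S, x):
    """Reverse the stack x using only the operations supplied by S."""
    def rev(y, z):
        if S.is_empty(y):
            return z
        return rev(S.pop(y), S.push(S.top(y), z))
    return rev(x, S.empty)


result = reverse(list_stack, make_stack(list_stack, [1, 2, 3]))
print(result == make_stack(list_stack, [3, 2, 1]))   # True
```
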
3.3 Generic types 

A generic type glues a value to an instance of an abstract data type. Thus, for example, we
might want a generic type called atom, such that each value carries with it a procedure for
printing it. A typical atom value might be:

    [string, Print~string$Print, "Hello"]

A simple way to get this effect (using <> for string concatenation) is

    AtomOps: (t: type → type) IS Print: (t → list char)
    atomT: type ~ t: type ××
        AtomOps(t)
    atom: type ~ at: atomT ××
        val: at$t

    PrintAtom: (a: atom → list char) IS a$Print(a$val)
    REC PrintList: (l: list atom → list char) IS
        IF null l THEN "[]"
        ELSE "[" <> PrintAtom(hd l)
             <> ","
             <> PrintList(tl l)
             <> "]"

With this we can write

    stringAtomT :~ atomT~[string, Print~PrintString];
    hello :~ atom~[stringAtomT, "Hello"]
    intAtomT :~ atomT~[int, Print~PrintInt];
    three :~ atom~[intAtomT, 3]

Then PrintAtom three = "3", and PrintList[hello, three, nil] = "[Hello,[3,[]]]".

This is fine for dealing with an individual value which can be turned into an atom, but suppose
we want to print a list of ints. It isn't attractive to first construct a list of atoms; we would like
to do this on the fly. This observation leads to different Print functions, using the same
definition of atom. The idea is to package a type t, and a function for turning t's into atom's.

    atomX :~ t: type ×× conv: t → atom

    PrintAtom: (at: atomX ×× v: at$t → list char) IS
        LET a :~ at$conv v IN a$Print(a$val)
    REC PrintList: (at: atomX ×× l: list at$t → list char) IS
        IF null l THEN "[]"
        ELSE "[" <> PrintAtom[at, hd l]
             <> ","
             <> PrintList[at, tl l]
             <> "]"

    IntAsAtom: atomX ~ (t :~ int,
        conv: (v: t → atom) IS (t :~ int, Print :~ PrintInt, val :~ v))
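
Both versions of the printing functions can be sketched in Python, with SimpleNamespace standing in for a binding. This is an illustration under our own assumptions rather than a rendering of the Pebble type rules: Python strings replace list char, Python lists replace Pebble lists, and the snake_case names are ours.

```python
from types import SimpleNamespace

# atomT values: a type t together with a Print function for it.
string_atom_t = SimpleNamespace(t=str, Print=lambda s: s)
int_atom_t = SimpleNamespace(t=int, Print=str)


def atom(at, val):
    """An atom value: an atomT binding together with a value of type at.t."""
    if not isinstance(val, at.t):
        raise TypeError("val must have type at.t")
    return SimpleNamespace(at=at, val=val)


def print_atom(a):
    return a.at.Print(a.val)


def print_list(atoms):
    """First version: the list already contains atoms."""
    if not atoms:
        return "[]"
    return "[" + print_atom(atoms[0]) + "," + print_list(atoms[1:]) + "]"


# atomX value: a type t and a conversion from t to atom.
int_as_atom = SimpleNamespace(t=int, conv=lambda v: atom(int_atom_t, v))


def print_list_x(at, values):
    """Second version: convert each element to an atom on the fly."""
    if not values:
        return "[]"
    head = print_atom(at.conv(values[0]))
    return "[" + head + "," + print_list_x(at, values[1:]) + "]"


hello = atom(string_atom_t, "Hello")
three = atom(int_atom_t, 3)
print(print_list([hello, three]))            # [Hello,[3,[]]]
print(print_list_x(int_as_atom, [1, 2]))     # [1,[2,[]]]
```

The second version never builds a list of atoms; it carries the conversion along and applies it one element at a time.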



3.4 Union types

There is a straightforward way to define a union type in Pebble:

    T1 ⊕ T2 ~ tag: bool ×× val: (IF tag THEN T1 ELSE T2)

Unfortunately, it doesn't work, because the inference rules cannot evaluate the IF based on the
rest of the program. For example, the expression

    LET x: T1 ⊕ T2 ~ . . . IN

will not type-check. There is no way to decompose a value of this type, and hence it is useless.

However, there are other ways to introduce unions. We have not completed a design, but here
is a suggestive sketch. Following Cardelli [1984], we introduce a union or sum type parallel to
the product type we already have. If d1 and d2 are declarations, then parallel to the labelled
product d1×d2 there is a labelled sum d1⊕d2. An expression with type d1 has type d1⊕d2, and
so does an expression with type d2. For example, if D :~ (i: int ⊕ r: real), then both i :~ 3 and
r :~ 3.14 have type D.

Parallel to

    LET B IN E

which decomposes a labelled product, is

    CASE E OF B

which decomposes a labelled sum. If the type of E is n1: t1 ⊕ n2: t2, the B in CASE must have
type

    n1: (t1 → t) × n2: (t2 → t)

In other words, B provides a suitable function for each case of E. The value of the CASE is
obtained by choosing the proper function and applying it to rhs E. More precisely, in view of its
type the value of E must be n1~e1 or n2~e2. In the former case, the value of the CASE is
B$n1(e1); in the latter it is B$n2(e2). To continue the previous example, after

    LET f :~ λ x: (i: int ⊕ r: real) → int IN
        CASE x OF [
            i :~ λ j: int → int IN j+1,
            r :~ λ s: real → int IN fix(s)]

f(i :~ 3) has the value 4, and so does f(r :~ 4.1416).
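
The evaluation of CASE amounts to selecting a function by label and applying it to the right hand side of the value. A minimal Python sketch, assuming a labelled value is encoded as a (label, value) pair, the binding B as a dictionary from labels to functions, and fix as truncation to an integer (all three are our assumptions):

```python
import math


def case(e, b):
    """CASE E OF B: e is a labelled value (n, v); b maps each label to a function."""
    label, value = e
    return b[label](value)


def f(x):
    # fix is assumed here to truncate a real to an int.
    return case(x, {
        "i": lambda j: j + 1,
        "r": lambda s: math.trunc(s),
    })


print(f(("i", 3)))        # 4
print(f(("r", 4.1416)))   # 4
```
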
3.5 Recursive types 

Pebble handles recursive functions in the standard operational style, relying on the fact that a
λ-expression evaluates to a closure in which evaluation of the body is deferred. The language
has types which involve closures, namely the dependent types constructed with →> and ××,
and it turns out that the operational semantics can handle recursive type definitions involving
these constructors. A simple example is

    LET REC IntList: type ~ head: int ×× tail: (l: IntList ⊕ v: void)

where for simplicity we have confined ourselves to lists of integers rather than introducing a 
type parameter. Although the evaluation rules for recursion were not designed to handle this 
kind of expression, they in fact do so quite well. 



It is also interesting to note that union types are not necessary for meaningful recursive types,
although they are very convenient. In fact, we can represent a list of integers as a function
which takes no argument and returns an integer and another such function. In Pebble:

    LET REC IntList: type ~ (void →> int × IntList);
        REC empty: IntList ~ (λ IntList IN [error, empty]) IN

Now we can define a list of the first n integers:

    LET REC f: int → IntList IS
        λ IntList IN IF i = 0 THEN [0, empty] ELSE [i, f(i-1)]

The usefulness of this definition may well be questioned, but it does show that recursive types 
are a purely functional phenomenon. 
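
The same representation works in any language with closures. In the Python sketch below a list is a function of no arguments that returns a pair of a head and another such function. The sentinel ERROR and the helper take are assumptions added for the illustration.

```python
ERROR = object()   # hypothetical stand-in for the error value


def empty():
    """The empty list: its head is an error and its tail is itself."""
    return (ERROR, empty)


def f(i):
    """The list i, i-1, ..., 0 as a function of no arguments."""
    def cell():
        return (0, empty) if i == 0 else (i, f(i - 1))
    return cell


def take(lst, n):
    """Collect the first n heads by calling the list function repeatedly."""
    heads = []
    for _ in range(n):
        head, lst = lst()
        heads.append(head)
    return heads


print(take(f(3), 4))   # [3, 2, 1, 0]
```

Nothing is computed until a cell is called, which is the deferred evaluation of the closure body that the recursive definition relies on.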

3.6 Assignment 

Although Pebble as we have presented it is entirely applicative, it is easy to introduce impera- 
tive primitives. For example, we can add 

    var: type → type

Then var int is the type of a variable whose contents is an int. We also need

    new: T: type →> var T ×
    MakeAssign: T: type →> (var T × T → void) ×
    MakeDereference: T: type →> (var T → T)

From MakeAssign and MakeDereference we can construct := and ↑ procedures for any type.

Of course, these are only declarations, and the implementation will necessarily be by
primitives. Furthermore, the semantics must be modified to carry around a store which := and ↑
can use to communicate.
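
One way to picture a semantics that carries a store around is to pass the store explicitly and return a new one from every assignment. The Python sketch below does this. It is our own illustration, not the modification the formal semantics would need: the dictionary store, the (type, location) encoding of a variable, and the run-time type checks are all assumptions.

```python
def new(store, T):
    """Allocate a fresh variable of type var T. Returns (new store, variable)."""
    location = len(store)
    return {**store, location: None}, (T, location)


def make_assign(T):
    """MakeAssign T: the := procedure for variables of type var T."""
    def assign(store, var, value):
        var_type, location = var
        if var_type is not T or not isinstance(value, T):
            raise TypeError("ill-typed assignment")
        return {**store, location: value}
    return assign


def make_dereference(T):
    """MakeDereference T: reads the contents of a variable of type var T."""
    def dereference(store, var):
        var_type, location = var
        if var_type is not T:
            raise TypeError("ill-typed dereference")
        return store[location]
    return dereference


store0 = {}
store1, v = new(store0, int)
store2 = make_assign(int)(store1, v, 5)
print(make_dereference(int)(store2, v))   # 5
```

Because each store is a separate value, the result of a dereference depends on which store it is given, which is exactly why such functions cannot be treated as applicative in the type-checker.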

In addition, steps must be taken to preserve the soundness of the type-checking in the presence 
of these non-applicative functions. The simplest way to do this is to divide the function types
into pure or applicative versus impure or imperative ones. MakeAssign and MakeDereference 
return impure functions, as does any function defined by a X-expression whose body contains 
an application of an impure function. Then an impure symbolic value is one that contains an 
application of an impure function. We can never infer that such a value is equal to any other 
value, even one with an identical form (at least not without a much more powerful reasoning 
system than the one in the Pebble formal semantics). 



4. Values and syntax 



This section gives a formal description of the values and syntax of Pebble. It also defines a rela- 
tion 'has type' (written :::) between values and types; in other words, it specifies the set of 
values corresponding to each type. Note that these sets are not disjoint. § 5 gives a formal 
description of the semantics of Pebble, and defines a relation 'has type' (written ::) between 
expressions and types. 

4.1 Values

We start our description of Pebble with a definition of the space of values. These may be
partitioned into subsets, such as function values, pairs and types. Some of these may be further
partitioned into more refined subsets, such as cross types and arrow types. Our values are the kind
of values which would be handled by a compiler or an interpreter, rather than the ones which
would be used in giving a traditional denotational semantics for our language. The main
difference is that we represent functions by closures instead of by the partial functions and
functionals of denotational semantics. Table 1 gives a complete breakdown of the set of values.

    e    e0     viz true, false, 0, 1, 2, etc.
         f      primitive(w)       where w is +, ×, etc.
                closure(ρ, d, E)
         nil
         e, e
         b      n~e
                nil
                b, b
                fix(f, d)
         t      t0     viz bool, int, etc.
                void
                t×t
                t→t
                d▷f
                d      n: t
                       void
                       d×d
                       d⋆f
         w!e
         f%e

    Table 1: Values

Each set of values, denoted by a lower case letter, is composed of the sets written immediately
to the right of it, e.g.

    e = e0 ∪ f ∪ nil ∪ (e, e) ∪ b ∪ t ∪ (w!e) ∪ (f%e)

where by (e, e) we mean the set of all values (v1, v2) such that v1 ∈ e and v2 ∈ e. Similarly nil
means {nil}, (w!e) means {(v1!v2) | v1 ∈ w, v2 ∈ e} and so on for each value constructing operator.



The primitive constants of the value space are written in this font. Constructors such as
closure are written in this font. Meta-variables which denote values or sets of values,
possibly of a given kind, are single lower-case letters in this font, possibly subscripted.

We now examine each kind of value in turn, giving a brief informal explanation. Indented 
paragraphs describe how a set of values may be partitioned into disjoint subsets. 

e is the set of all values, everything which may be denoted by an expression.

    e0 consists of the primitive values true, false, 0, 1, ...: all except the functions and types.

    f consists of the values which are functions, as follows:

        The values primitive(w), where w is some built-in function such as addition or
        multiplication of integers. They include functions on types such as ×. We write
        primitive(w) rather than just w to show that w is tagged as a primitive function.
        This is useful for matching purposes in our operational semantics; see § 5.

        closure values, the results of evaluating λ-expressions. A closure is composed of:
            an environment ρ, which associates a type and a value with each name;
            a declaration value d, which gives the bound variables of the λ-expression;
            a body expression E (expressions are defined in § 4.2).

    nil, the 0-tuple.

    [e, e], the 2-tuples (ordered pairs) of values. The pair forming operation is ",". In general
    we use brackets for pairs, as in [1, [2, [3, nil]]]; formally, brackets are just a syntactic
    variant of parentheses. Since "," associates to the right, we can also write [1, 2, 3, nil].

    b, binding values, which associate names with values. For example evaluating LET x:
    int~1+2 IN ... will produce a binding x~3 which associates x with 3. Strictly we should
    discriminate between "binding expressions" and "binding values", but mostly we will be
    sloppy and say "binding" for either. Bindings are either elementary or tuples, thus:

        N~e, which binds a single name N to a value e.

        nil. The 0-tuple is also a binding.

        [b, b], which is a pair of bindings, is also a binding. The binding [b1, b2] binds the
        variables of b1 and those of b2. This is a special case of [e, e] above, since b is a subset of e.

        fix values, which result from the evaluation of recursive bindings. A fix value
        contains the declaration of the names being recursively defined and the function which
        represents one step of the recursive definition (roughly, the functional whose fixed
        point is being computed). Details are given in § 5.2.5.

    t, type values, consisting of:

        t0, some built-in types such as booleans (bool) and integers (int). They include the
        type type which is the type of all type expressions.

        void, the type of nil.

        t×t, which is the type of pairs. If expression E1 has type t1 and expression E2 has type
        t2, then the pair E1, E2 has type t1×t2.

        t→t, which is the type of functions.

        d, declarations. These are the type of bindings; for example, the type of x: int~1+2 is
        x: int. They give types for the three kinds of bindings above.

            N: t, a basic declaration, which associates name N with type t, e.g. x: int.

            void, the type of the nil binding.

            d×d, the type of a pair of bindings (a special case of t×t).

            d⋆f, a dependent version of d×t. This is explained in § 2.5.

        d▷f, a dependent version of d→t. This is also explained in § 2.5.

    w!e, the application of the primitive function w to the value e. Such applications are values
    which may be simplified.

    f%e, the application of the symbolic value f to the value e. This is explained in § 2.6.

We may define a relation ::: between values and types, analogous to the :: ('has type') relation
between expressions and types defined in § 5. Unlike the latter, it is independent of any
environment. We could define it by operational semantic rules, but it is shorter to give the
following informal inductive definition. In one or two places we need the ⇒ ('has value') relation
between expressions and values defined in § 5. We first define a subsidiary relation 'is the type
part of' between type values and declaration values; for example, int is the type part of x: int
and int×bool is the type part of x: int × p: bool.

    t is the type part of N: t

    void is the type part of void

    t1×t2 is the type part of d1×d2 if t1 is the type part of d1 and t2 is the type part of d2
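
The three clauses of this subsidiary relation translate directly into a recursive function. The Python sketch below assumes a hypothetical tuple encoding in which None is void, ("name", N, t) is N: t and ("cross", a, b) is a product of declarations or of types; dependent declarations are outside its scope.

```python
def type_part(d):
    """Type part of a non-dependent declaration value."""
    if d is None:                        # void is the type part of void
        return None
    if d[0] == "name":                   # t is the type part of N: t
        return d[2]
    if d[0] == "cross":                  # componentwise for d1×d2
        return ("cross", type_part(d[1]), type_part(d[2]))
    raise ValueError("no type part is defined here for this declaration")


decl = ("cross", ("name", "x", "int"), ("name", "p", "bool"))
print(type_part(decl))   # ('cross', 'int', 'bool')
```
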

Now for the definition of ::: 

• true ::: bool, false ::: bool, 0 ::: int, 1 ::: int, and so on.

  primitive(not) ::: bool→bool, and so on for other operators.

  primitive(×) ::: type×type→type,
  primitive(→) ::: type×type→type.

  closure(ρ, d, E) ::: t1→t2 if t1 is the type part of d and for all bindings b such that b ::: d
  we have ρ[d~b] ⊢ E :: t2.

• nil ::: void.

  e1, e2 ::: t1×t2 if e1 ::: t1 and e2 ::: t2.

• N~e ::: N: t if e ::: t.
  fix(f, d) ::: d if f ::: d→d.

• bool ::: type, int ::: type, type ::: type, void ::: type.
  t1×t2 ::: type if t1 ::: type and t2 ::: type.

• N: t ::: type.

  d⋆f ::: type if f ::: d→type.
  d▷f ::: type if f ::: d→type.

• w!e ::: t2 if e ::: t1 and primitive(w) ::: t1→t2.
  f%e ::: t2 if e ::: t1 and f ::: t1→t2.

• e1, e2 ::: d⋆f if supposing that ⊢ (f#d→type)(e1#d) ⇒ t2, then e1, e2 ::: d×t2.

  f1 ::: d▷f if for all e such that e ::: d, if ⊢ (f#d→type)(e#d) ⇒ t2 then (f1#d→t2)(e#d) ::: t2.

The last clause might be written more simply by defining a notion of application for values, say
f e analogous to F E for expressions, and writing

• e1, e2 ::: d⋆f if e1, e2 ::: d×(f e1).

  f1 ::: d▷f if for all e such that e ::: d, f1 e ::: f e.

Now if E⇒e, we would like to have e ::: t if and only if E :: t. But our type-checking rules,
which use symbolic evaluation, cannot always achieve this. A closure may have a certain type
for all bindings, but symbolic evaluation may fail to show this. Consider for example

    λ x: int →> (IF x<x+1 THEN int ELSE bool) IN x :: int→int

This is not derivable from our type-checking rules because symbolic evaluation cannot show
that x<x+1 for an arbitrary integer x. But the latter is true, so if f is the value of the lambda
expression we do get f ::: int→int by the definition above for closures. This limitation does not
seem to present a major practical obstacle, but the matter would repay further study.

4.2 Syntax

We can give the syntax of Pebble in traditional BNF form, but there will be only three syntax
classes: name (N), number (I) and expression (E).

    N ::= letter (letter | digit)*

    I ::= digit digit*

    E ::= bool | int | void | E×E | E→E | E→>E | N:E | E××E | type | true | false | I |
          nil | E, E | λ E IN E | E~E | REC E~E | E; E | N:~E | N | IMPORT N IN E |
          IF E THEN E ELSE E | E E | LET E IN E | typeOf E | (E) | [E]

It is more helpful to divide the expressions up according to the type of value they produce. We
distinguish subsets of the set E of all expressions thus: T for types, D for declarations, B for
bindings and F for functions. These cannot be distinguished syntactically since an
operator/operand expression of the form E E could denote any of these, as could a name used
as a variable. However it makes more sense if we write, for example, LET B IN E instead of
LET E IN E, showing that LET requires an expression whose value is a binding.

It is also helpful to organise the syntax according to types and to the introduction and
elimination rules for expressions of each type. This is a common format in recent work on logic. For
example a value of type T1×T2 is introduced by an expression of the form E1, E2; it is
eliminated by expressions of the form fst E or snd E.

The syntax presented in this way is shown in Table 2; a list of the notations used is given in 
Table 3. Table 4 shows some abbreviations which make Pebble more readable, for example 
eliminating the λ notation for function definitions. 

     Type          Introduction            Elimination 

T    bool          true   false            IF E THEN E1 ELSE E2 
     int           0 1 2 ... 
     void          nil 
     T1×T2         E1, E2                  fst E   snd E 
     D→T           F   λ T IN E            F E 
                       primitives 
D    N: T          B   D~E                 LET B IN E 
     D1×D2             REC D~E 
     D1××D2            N:~E 
     type          all types in the        typeOf D 
                   left column 
                   N                       IMPORT N IN E 

Either round or square brackets may be used for grouping. 

The →, →>, × and ×× operators associate to the right, others to the left. 

Precedence is: lowest IN, then …, then ~, then →>, then × ××, highest 

Table 2: Syntax 

Non-terminal   Must evaluate to   Example 

N              name               i 
E              expression         gcd(i, 3)+1 
T              type               int 
D              declaration        i: int 
B              binding            i: int~3 
F              function           λ i: int→bool IN i>3 

All the non-terminals except N are syntactically equivalent to E. 

Table 3: Summary of abbreviations 



Write            For                                        Example 

N: T IS E        N:~λ T IN E                                f: (i: int × x: real)→real IS x^i 

[N1, N2] :~ E    N1 :~ fst E, N2 :~ snd E                   [i, j] :~ QuotRem(7, 2) 
                                                              = i:~3, j:~1 

B$N              LET B' :~ B IN IMPORT B' IN LET B' IN N    [x:~π]$x = π 

E WHERE B        LET B IN E                                 i+4 WHERE i:~3 

Table 4: Sugar 



Type          Introduction 

bool, int     boolI, intI 
              (a0) true :: bool ⇒ true    (b0) false :: bool ⇒ false    (c0) 0 :: int ⇒ 0 

void, T1×T2   ×I 
              (1)  E1 :: t1, E2 :: t2 
              (a0) nil :: void ⇒ nil      (0) [E1, E2] :: t1×t2 ⇒ [e1, e2] 

D→>T          →I 
              (1)  { T1 ⇒ t11, t11 ≈ d→t, typeOf d#type ⇒ t0, t0→t = t1 
              (2)    or T1 ⇒ t1, t1 ≈ d▷f, f#d→type(newc#d) ⇒ t }, 
              (4)  LET newc#d IN E :: t            where newc is a new constant 
              (0)  (λ T1 IN E) :: t1 ⇒ closure(ρ, d, E) 

              d  = parameter decl 
              t0 = parameter type 
              e0 = argument value 
              t  = result type 
              t1 = type of λ-exp 

N:T, D1××D2   :I 
              (1)  { D ⇒ d, d ≈ void, E :: void, nil = b 
              (2)    or D ⇒ d, d ≈ N: t, E :: t, (N~e) = b 
              (3)    or D ⇒ d, d ≈ d1×d2, [d1#type~fst E, d2#type~snd E] :: d ⇒ b 
              (4)    or D ⇒ d0, d0 ≈ d1⋆f, f#d1→type(d1#type~fst E) ⇒ d2, 
              (5)       d1×d2 = d, d#type~E :: d ⇒ b } 
              (0)  D~E :: d ⇒ b 

              :Ia 
              (a1) D ⇒ d, (λ F': D→D IN LET F' IN d#type~E) ⇒ f 
              (a2) f#d→d(fix(f, d)#d) ⇒ b 
              (a0) REC D~E :: d ⇒ b 

              :Ib, :Ic 
              (b1) [B1, LET B1 IN B2] :: d ⇒ b        (c1) E :: t 
              (b0) B1; B2 :: d ⇒ b                    (c0) N :~ E :: (N: t) ⇒ N~e 

type          typeI 
              (a1) (λ B': D→type IN LET B' IN T) ⇒ f 
              (a0) D→>T :: type ⇒ d▷f 
              (b1) (λ B': D→type IN LET B' IN T) ⇒ f 
              (b0) D××T :: type ⇒ d⋆f 
              (c1) T :: type 
              (c0) N: T :: type ⇒ N: t 

Names         NI 
              (1)  ρ(N) ≈ t~e0, 
              (2)  { e0 ⇝> e else e0 = e } 
              (0)  N :: t ⇒ e 

Table 5: Inference rules 



Type          Elimination 

bool, int     boolE 
              (1)  E :: bool, E1 :: t, E2 :: t, 
              (2)  { E ⇒ true, E1 ⇒ e or E ⇒ false, E2 ⇒ e } 
              (0)  IF E THEN E1 ELSE E2 :: t ⇒ e 

void, T1×T2   ×E 
              (a0) fst :: (t×t1)→t        (b0) snd :: (t1×t)→t 

D→>T          →E 
              (1)  { F :: t0→t, E0 :: t0 
              (2)    or F :: d▷f1, f1#d→type(d#type~E0) ⇒ t }, 
              (3)  { f ≈ primitive(w), { w!e0 ⇝ e else w!e0 = e } 
              (4)    or f ≈ closure(ρ0, d, E), d#type~E0 ⇒ b, ρ0 ⊢ LET b#d IN E ⇒ e 
              (5)    else in case f is a symbolic value f%e0 = e } 
              (0)  F E0 :: t ⇒ e 

              d  = parameter decl 
              t0 = parameter type 
              e0 = argument value 
              t  = result type 
              t1 = type of λ-exp 

N:T, D1×D2,   :E 
D1××D2        (1)  { B :: void, E :: t ⇒ e 
              (2)    or B :: (N: t0), rhs B ⇒ e0, ρ[N: t0~e0] ⊢ E :: t ⇒ e 
              (3)    or B :: d1×d2, snd B ⇒ b2, LET fst B IN LET b2#d2 IN E :: t ⇒ e 
              (4)    or B :: d1⋆f, f#d1→type(fst B) ⇒ d2, LET b#d1×d2 IN E :: t ⇒ e } 
              (0)  LET B IN E :: t ⇒ e 

              (b0) rhs :: (N: t)→t 

type          typeE                                 these two are for convenience only 
              (1)  { d ≈ void, void = t 
              (2)    or d ≈ N: t 
              (3)    or d ≈ d1×d2, typeOf d1#type × typeOf d2#type ⇒ t 
              (4)    or d ≈ d1⋆f, typeOf!d = t }, 
              (5)  D :: type 
              (0)  typeOf D :: type ⇒ t 

Names         NE 
              (1)  [N: ρ(N)] ⊢ E :: t ⇒ e 
              (0)  IMPORT N IN E :: t ⇒ e 

Table 5: Inference rules (continued) 

Auxiliary rules 

::            (1)  { t ≈ d1⋆f, f#d1→type(fst E) ⇒ d2, E :: d1×d2 
              (2)    or t ≈ typeOf!d, d#type~E :: d } 
              (0)  E :: t 

#             (0)  e#t :: t ⇒ e 

⇝             for each <arg, result> pair in the primitive w 
              (0)  w!e0 ⇝ e 

⇝a-c          (a0) fst![e, e1] ⇝ e     (b0) snd![e1, e] ⇝ e     (c0) rhs!(N~e) ⇝ e 

⇝>            (1)  { e0 ≈ w!e1, e1 ⇝> e2, w!e2 ⇝ e 
              (2)    or e0 ≈ fix(f, d), f#d→d(e0#d) ⇒ e } 
              (0)  e0 ⇝> e 

Notation 

e#t is an expression with value e and type t. 
e0 ⇝ e (simplify), e0 ⇝> e (unroll) define functions on values. 
Italics: meta-variables bound by deterministic evaluator. 
Value constants: type; value constructors: closure. 

N = name                                                i 
E = expression           e = value                      gcd(i, 3)+1 
T = E with t value       t = type value                 int 
D = E with d value       d = declaration value          i: int 
B = E with b value       b = binding value              i: int~3 
F = E with f value       f = function value             λ i: int→bool IN i>3 

Table 5: Inference rules (concluded) 



5. Operational semantics 



We have a precise operational semantics for Pebble, in the form of the set of inference rules in 
Table 5. This section gives the notation for the inference rules, explains why they yield at most 
one value for an expression, and discusses the way in which values can be converted into ex- 
pressions and fed back through the inference system. Then we explain how each rule works, 
and finally show how to derive an efficient type-checker and evaluator from the rules. 

5.1 Inference rule semantics 

The basic idea, which we derive from Plotkin, is to specify an operational semantics by means 
of a set of inference rules. The operations of evaluation are the steps in a proof that uses the 
rules. The advantage of this approach is that the control mechanism of the evaluator does not 
need to be written down, since it is implicit in the well-known algorithm for deriving a proof. 
Indeed, our rules can be trivially translated into Prolog, and then can be run to give a working 
evaluator. We have in fact carried out this translation in part. 

In general, of course, this will lead to a non-deterministic and inefficient evaluator; the par- 
ticular rules we use, however, allow an efficient deterministic evaluator to be easily derived. 

5.1.1 Notation 

Each rule has a set of premises assertion_1, ..., assertion_n and a conclusion assertion_0, 
written thus: 

assertion_1, ..., assertion_n 
----------------------------- 
assertion_0 

As usual, the meaning is that if each of the premises is established, then the conclusion is also 
established. We write 

assertion_11, ..., assertion_1n or assertion_21, ..., assertion_2m 
------------------------------------------------------------------ 
assertion_0 

as an abbreviation for the two rules 

assertion_11, ..., assertion_1n        assertion_21, ..., assertion_2m 
-------------------------------        ------------------------------- 
assertion_0                            assertion_0 

Note that or has lower precedence than ",". Sometimes or is more deeply nested, in which case 
the meaning is to convert the premises to disjunctive normal form, and then apply this 
expansion. 

An assertion is: 

environment ⊢ simple assertion 

An environment is a function mapping a name to a type and a value. The environment for the 
conclusion is always denoted by ρ, and is not written explicitly. If the environment for a 
premise is also ρ (as it nearly always is), it is also omitted. 

A simple assertion is one of: 

1) E :: t asserts that E has type t in the given environment. 

2) E ⇒ e asserts that E has value e in the given environment. 

3) e ≈ format asserts that e is of the form given by format. For example, e ≈ t1→t2; here 
   t1→t2 is a format, with variables t1 and t2. If e is int→bool, this assertion 
   succeeds with t1 = int and t2 = bool. 

There are three forms of simple assertion which are convenient abbreviations: 

4) E :: t ⇒ e combines (1) and (2). 

5) E :: format combines (1) and (3); it is short for E :: t, t ≈ format. 

6) e1 = e2 asserts that e1 is equal to e2; this is a special case of (3). 

Finally, there are two forms of simple assertion which correspond to introducing auxiliary 
functions into the evaluator: 

7) e1 ⇝ e2 asserts that e1 simplifies to e2, using the simplification rules ⇝ which tell how to 
   evaluate primitives. See § 5.2.2. 

8) e1 ⇝> e2 asserts that e1 unrolls to e2, using the ⇝> rule for unrolling fix. See § 5.2.5. 

By convention we write a lower-case e for the value of the expression E, and likewise for any 
other capital letter that stands for an expression. If a lower-case letter x appears in an assertion 
but no premise is given to bind it, then the premise 

X ⇒ x 

is implied. 

A reminder of our typographic conventions: 

We use capital letters for meta-variables denoting expressions, and lower-case letters for 
meta-variables denoting values; both may be subscripted. Thus expressions appear on the 
left of :: and ⇒ in assertions, and values everywhere else. 

Value constants are written this way: e.g., true, x: int. 

The value constructors that are not symbols are primitive, closure and fix. 

An italicized meta-variable indicates where that variable will be bound by a deterministic 
evaluator, as explained in the next section. 

5.1.2 Determinism 

In order to find the type of an expression E, we try to prove E :: t, where t is a new 
meta-variable. If a proof is possible, it yields a value for t as well. Similarly, we can use the 
inference rules to find the value of E by trying to prove E ⇒ e. We would like to be sure that an 
expression has only one value (i.e., that E ⇒ e1 and E ⇒ e2 implies e1 = e2). This is guaranteed 
by the fact that the inference rules for evaluation are deterministic: at most one rule can be 
applied to evaluate any expression, because there is only one conclusion for each syntactic form. 
When there are multiple rules abbreviated with or, the first premise of each rule excludes all the 
others. In a few places we write 

as an abbreviation for 



not fljp not flji "<>t ^kv ^2 

The fact that the rules are deterministic is important for another reason: they define a reason- 
ably efficient deterministic program for evaluating expressions. We will have more to say about 
this in § 5.4. 

It is not true, however, that an expression has only one type. In particular, the auxiliary rule :: 
may allow types to be inferred for an expression in addition to the one which is computed, 
along with its value, by all the other rules. We will say more about what this means for deter- 
ministic evaluation in §5.2.6. 

In each rule one occurrence of each meta-variable is italicized. This is the one which the 
deterministic evaluator will use to bind the meta-variable. For example, in ×I1, t1 and t2 are 
bound to the types of E1 and E2 respectively; they are used in ×I0 to compute t1×t2, the type of 
[E1, E2]. The italic occurrence of e may be omitted if it is E ⇒ e, as explained earlier. Thus the 
e1 and e2 in ×I0 are bound by omitted premises E1 ⇒ e1 and E2 ⇒ e2. The italics are not part of 
the inference rules, but are just a comment which is relevant for deterministic evaluation, and 
may be a help to the reader as well. 

It may also be helpful to know that the premises are written in the order that a deterministic 
evaluator would use. In particular, each meta-variable is bound before it is used. In this order- 
ing, the expression in the conclusion should be read first, then the premises, and then the rest 
of the conclusion. 

5.1.3 Feedback 

An important device for keeping the inference rules compact is that a value with a known type 
can be converted into an expression, which can then be embedded in a more complex 
expression whose type and value can be inferred using the entire set of rules. This feedback from 
the value space to the expression space is enabled by the syntax 

e#t 

This is an expression which has value e and type t. This form of expression is not part of the 
language, but is purely internal to the inference rules. Usually the type is not interesting, 
although it must be there for the feedback to be possible, so we write such an expression with 
the type in a small font, to make it easier for the reader to concentrate on the values. In the 
text of the paper, we often drop the #t entirely, where no confusion is possible. 

5.2 The rules 

The inference rules are organized like the syntax in Table 2, according to the expression forms 
for introducing and eliminating values of a particular type. A particular rule is named by the 
constructor for the type, followed by I for introduction or E for elimination; thus →I is the 
rule for λ-expressions, which introduce function values with types of the form t1→t2. Each line 
is numbered at the left, so that, for example, the conclusion of the rule for λ-expressions can 
be named by →I0. If there is more than one rule in a part of the table labeled by the same 
name, the less important ones are distinguished by letters a, b, ...; thus :Ia is the rule for REC. 
Auxiliary rules, with conclusions which are not part of the syntax, appear at the end of Table 5. 

5.2.1 Booleans, pairs and names 

The inference rules for booleans are extremely simple. 

boolI  (a0) true :: bool ⇒ true      (b0) false :: bool ⇒ false 

boolE  (1) E :: bool, E1 :: t, E2 :: t, 
       (2) { E ⇒ true, E1 ⇒ e or E ⇒ false, E2 ⇒ e } 
       (0) IF E THEN E1 ELSE E2 :: t ⇒ e 

boolI tells us that the expressions true and false both have type bool and evaluate to true and 
false respectively; these rules have no premises, since the conclusions are always true. boolE 
says that the expression 

IF E THEN E1 ELSE E2 

typechecks and has type t if E has type bool, and E1 and E2 both have type t for some t. The 
value of the IF is the value of E1 if the value of E is true, the value of E2 if the value of E is 
false. Thus 

(A) IF true THEN 3 ELSE 5 
has type int and value 3. 

We can display this argument more formally as an upside-down proof, in which each step is ex- 
plicitly justified by some combination of already justified steps and inference rules (together 
with some meta-rules which are not mentioned explicitly, such as substitution of equals for 
equals). 



(A1) IF true THEN 3 ELSE 5 :: int ⇒ 3        2, 3, 4, boolE 
(A2) true :: bool ⇒ true                      boolIa 
(A3) 3 :: int ⇒ 3                             intIc 
(A4) 5 :: int                                 intIc 



In this display we show the conclusion at the top, and successively less difficult propositions 
below it. Viewing the inference rules as a (deterministic) evaluation mechanism, each line 
shows the evaluation of an expression from the values of its subexpressions, which are calcu- 
lated on later lines. Control flows down the table as the interpreter is called recursively to 
evaluate sub-expressions, and then back up as the recursive calls return results that are used to 
compute the values of larger expressions. 
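
The recursive reading of proof (A) can be sketched in Python. This is only an illustration under stated assumptions: the function name `infer` and the tuple encoding `("if", cond, then, else)` are hypothetical and not part of Pebble, and for brevity the sketch evaluates both branches, whereas boolE2 evaluates only the selected one.

```python
def infer(expr):
    """Return (type, value), mirroring the combined assertion E :: t => e."""
    if isinstance(expr, bool):            # boolI (checked before int: bool is an int in Python)
        return ("bool", expr)
    if isinstance(expr, int):             # intI
        return ("int", expr)
    if expr[0] == "if":                   # boolE
        _, cond, then_branch, else_branch = expr
        cond_type, cond_value = infer(cond)
        then_type, then_value = infer(then_branch)
        else_type, else_value = infer(else_branch)
        # boolE1: the condition is a bool and both branches share one type t.
        if cond_type != "bool" or then_type != else_type:
            raise TypeError("IF does not typecheck")
        # boolE2: pick the value of the selected branch.
        return (then_type, then_value if cond_value else else_value)
    raise ValueError(f"unknown expression: {expr!r}")


result = infer(("if", True, 3, 5))
```

Each recursive call corresponds to one of the lines (A2) to (A4); the returned pair for the whole expression corresponds to (A1).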



The rules for pairs are equally simple. 

×I  (1)  E1 :: t1, E2 :: t2 
    (a0) nil :: void ⇒ nil      (0) [E1, E2] :: t1×t2 ⇒ [e1, e2] 

×E  (a0) fst :: (t×t1)→t        (b0) snd :: (t1×t)→t 

×Ia says that nil has type void and value nil. ×I says that the type of [E1, E2] is t1×t2 if ti is 
the type of Ei, and its value is [e1, e2]. ×E gives the (highly polymorphic) types of the 
primitives fst and snd that decompose pairs. 

The rules for names are also straightforward, except for NI2, which is treated in § 5.2.5 since it 
is needed only for recursion. 

NI  (1) ρ(N) ≈ t~e 
    (0) N :: t ⇒ e 

NE  (1) [N: ρ(N)] ⊢ E :: t ⇒ e 
    (0) IMPORT N IN E :: t ⇒ e 

We can use NI to show 

[i: int~3] ⊢ IF true THEN i ELSE 0 :: int ⇒ 3 
following the proof of (A) above, but replacing (A3) with 
(A3') [i: int~3] ⊢ i :: int ⇒ 3        NI 

The IMPORT construct has a very simple rule, NE, which says that to evaluate IMPORT N IN E, 
evaluate E in an environment which contains only the current binding of N. 
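
A minimal Python sketch of these two rules, assuming an environment represented as a dictionary from names to (type, value) pairs. The helper names `lookup` and `import_only` are hypothetical, and the body of the IMPORT is modelled as a Python function of the restricted environment.

```python
def lookup(env, name):
    """NI in spirit: a name has the type and value recorded in the environment."""
    return env[name]


def import_only(env, name, body):
    """NE in spirit: run body in an environment holding only the binding of name."""
    restricted = {name: env[name]}
    return body(restricted)


env = {"i": ("int", 3), "j": ("int", 7)}
visible = import_only(env, "i", lambda rho: sorted(rho))
```

Inside the IMPORT only `i` is visible, so a body that mentions `j` would fail to find it.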

5.2.2 Functions 

The pivotal inference rules are →I (for defining a function by a λ-expression) and →E (for 
applying a function). The →I rule is concerned almost entirely with type-checking. If the type 
checks succeed, it returns a closure which contains the current environment ρ, the declaration d 
for the parameters, and the unevaluated expression E which is the body of the λ-expression. A 
later application of this closure to an argument E0 is evaluated (using →E) by evaluating the 
expression 

(1) LET d~E0 IN E 

in the environment ρ which was saved in the closure. 

We begin with the basic rule for λ, omitting line 2, which deals with dependent function types: 

→I  (1) T1 ⇒ t11, t11 ≈ d→t, typeOf d#type ⇒ t0, t0→t = t1 
    (4) LET newc#d IN E :: t            where newc is a new constant 
    (0) (λ T1 IN E) :: t1 ⇒ closure(ρ, d, E) 

    d  = parameter decl 
    t0 = parameter type 
    t  = result type 
    t1 = type of λ-exp 

The notes beside →I explain the meaning of the meta-variables. The expression T1 in the λ 
roughly gives the type of the entire λ-expression. Thus 

(B) λ i: int→int IN i+1 

has T1 = (i: int→int), and its type (called t1) is int→int. The value of T1 is called t11; t1 
differs from it in that the declaration i: int has been reduced to its type int. This is done by 
(→I1), which accepts a T1 which evaluates to something of the form d→t, and computes first t0 as 
typeOf d (using typeE), and then t1 as t0→t. 

The idea of (→I4) is that if we can show that (1) type-checks without any knowledge of the 
argument values, depending only on their types, then whenever the closure is applied to an 
expression with type t0 the resulting (1) will surely type-check. This is the essence of static 
type-checking: the definition of a function can be checked independently of any application, and 
then only the argument type need be checked on each application. 

(→I4) is true if we can show that 

(2) LET newc#d IN E 

has the result type t, where newc is a new constant, about which we know nothing except that 
its type is d. In other words, newc is a binding for the names in d, in which each name has the 
type assigned to it by d. For our example (B), we have 

(3) LET newc1#i: int IN i+1 

which must have type int. To show this, we need the base case of :E, the rule for LET. 

:E  (2) B :: (N: t0), rhs B ⇒ e0, ρ[N: t0~e0] ⊢ E :: t ⇒ e 
    (0) LET B IN E :: t ⇒ e 

Using this, (3) has type int if 

ρ[i: int~rhs!newc1] ⊢ i+1 

has type int. Since i+1 is sugar for plus[i, 1], its type is given by the result type of plus 
(according to →E1), provided that [i, 1] has the argument type of plus. Since 

plus :: int×int→int 

we have the desired result if [i, 1] :: int×int. Using ×I this is true if i :: int and 1 :: int. The 
latter is immediate, since 1 is a primitive. According to NI, the former is true if ρ(i) ≈ int~e0. 
But in fact ρ(i) = int~rhs!newc1, so we are done. 



We can write this argument more formally as follows: 

(B1) ρ ⊢ LET newc1#i: int IN i+1 :: int          2, :E 
(B2) ρ1 ⊢ i+1 :: int                              3, →E 
        where ρ1 = ρ[i: int~rhs!newc1] 
(B3) ρ1 ⊢ plus :: t→int, [i, 1] :: t              4, 5 
(B4) ρ1 ⊢ plus :: int×int→int                     primitive 
(B5) ρ1 ⊢ [i, 1] :: int×int                       6, ×I 
(B6) ρ1 ⊢ i :: int, 1 :: int                      7, NI, primitive 
(B7) ρ1(i) ≈ int~e0                               inspection 



We now consider the non-dependent case of application, and return to λ-expressions with 
dependent types in the next section. 

→E  (1) F :: t0→t, E0 :: t0, 
    (3) { f ≈ primitive(w), { w!e0 ⇝ e else w!e0 = e } 
    (4)   or f ≈ closure(ρ0, d, E), d#type~E0 ⇒ b, ρ0 ⊢ LET b#d IN E ⇒ e 
    (5)   else in case f is a symbolic value f%e0 = e } 
    (0) F E0 :: t ⇒ e 

The type-checking is done by →E1, which simply checks that the argument E0 has the 
parameter type t0 of the function. There are three cases for evaluation, depending on whether f 
is a primitive, a closure, or a symbolic value. 

If f is primitive(w), →E3 tries to use the ⇝ rules for evaluating primitives to obtain the 
value of the primitive when applied to the argument value e0. 

⇝   for each <arg, result> pair in the primitive w 
    (0) w!e0 ⇝ e 

Because of the type-check, this will succeed for a properly constructed primitive unless e0 is a 
symbolic value, i.e. contains a newc constant or a fix. If no ⇝ rule is applicable, the value is 
just w!e0, i.e., a more complex symbolic value. 

Thus the ⇝ rules can be thought of as an evaluation mechanism for primitives which is 
programmed entirely outside the language, as is appropriate for functions which are primitive 
in the language. In its simplest form, as suggested by the ⇝ rule above, there is one rule for 
each primitive and each argument value, which gives the result of applying that primitive to 
that value. More compact and powerful rules are also possible, however, as ⇝a-c illustrate. 

⇝a-c  (a0) fst![e, e1] ⇝ e     (b0) snd![e1, e] ⇝ e     (c0) rhs!(N~e) ⇝ e 

Note that the soundness of the type system depends on consistency between the types of a 
primitive (as expressed in rules like ×Ea-b), and the ⇝ rules for that primitive (⇝a-b for fst 
and snd). For each primitive, a proof is required that the ⇝ rules give a result for every 
argument of the proper type, and that the result is of the proper type. 

If f is closure(ρ0, d, E), →E4 first computes a binding b = d#type~E0 from the argument in 
the current environment, and then evaluates the closure body E in the closure environment ρ0 
augmented by b. Note the parallel with →I4, which is identical except that the unknown 
argument binding newc#d replaces the actual argument binding d#type~E0. The success of the 
type-check made by →I4 when f was constructed ensures that the LET in →E4 will type-check. 

If f is neither a primitive nor a closure, it must be a symbolic value. In this case there is not 
enough information to evaluate the application, and it is left in the form f%e0. There is no 
hope for simplifying this in any larger context. 
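
The three evaluation cases of →E can be sketched as a small Python dispatcher. All names here (`Symbolic`, `Primitive`, `Closure`, `apply_value`, `show`) are hypothetical; a closure body is modelled as a Python function of its environment, and a primitive is treated as stuck only when its argument is itself symbolic, which is a simplification of the "contains a newc constant or a fix" condition.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Symbolic:
    text: str                      # e.g. "newc1" or "fst!newc1"


@dataclass(frozen=True)
class Primitive:
    name: str
    fn: Callable[[Any], Any]       # stands in for the table of <arg, result> pairs


@dataclass(frozen=True)
class Closure:
    env: Dict[str, Any]            # saved environment
    param: str                     # parameter name (a one-name declaration)
    body: Callable[[Dict[str, Any]], Any]


def show(value):
    return value.text if isinstance(value, Symbolic) else repr(value)


def apply_value(f, arg):
    if isinstance(f, Primitive):                       # line 3 of the rule
        if isinstance(arg, Symbolic):
            return Symbolic(f"{f.name}!{arg.text}")    # no simplification applies
        return f.fn(arg)
    if isinstance(f, Closure):                         # line 4: body in saved env plus argument
        return f.body({**f.env, f.param: arg})
    return Symbolic(f"{show(f)}%{show(arg)}")          # line 5: f itself is symbolic


fst = Primitive("fst", lambda pair: pair[0])
inc = Closure({}, "i", lambda rho: rho["i"] + 1)
```

Applying `fst` to a concrete pair yields its first component, while applying it to `Symbolic("newc1")` yields the larger symbolic value `fst!newc1`, in line with the text above.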



5.2.3 Dependent functions 

We now return to the function rule, and consider the case in which the λ-expression has a 
dependent type. 

→I  (2) T1 ⇒ t1, t1 ≈ d▷f, f#d→type(newc#d) ⇒ t, 
    (4) LET newc#d IN E :: t            where newc is a new constant 
    (0) (λ T1 IN E) :: t1 ⇒ closure(ρ, d, E) 

The only difference is that →I2 applies instead of →I1; it deals with a function whose result 
type depends on the argument value, such as the swap function defined earlier by: 

(C) swap :~ λ (t1: type × t2: type)→>(t1×t2→t2×t1) IN 
        λ x1: t1 × x2: t2→t2×t1 IN [x2, x1] 

The type expression for swap (following the first λ) evaluates (by typeIa) to 

(4) (t1: type × t2: type)▷ 
        closure(ρ, B': (t1: type × t2: type), LET B' IN t1×t2→t2×t1) 

In this case the parameter type of swap is just (t1: type × t2: type); we do not use typeOf to 
replace it with type×type. This would be pointless, since the names t1 and t2 would remain 
buried in the closure, and to define equality of closures by the α-conversion rule of the 
λ-calculus would take us afield to no good purpose. Furthermore, if elsewhere in the program 
there is another type expression which is supposed to denote the type of swap, it must also have 
→> as its main operator, and a declaration with names corresponding to t1 and t2. This is in 
contrast with the situation for a non-dependent function type, which can be written without any 
names. The effect of leaving the names in, and not providing α-conversion between closures, is 
that two dependent function types must use the same names for the parameters if they are to 
be the same type. 

We do, however, need to compute an intended result type against which to compare the type 
of (1). This is done by applying the closure in (4) to newc1, a new constant which must be the 
same here and in the instantiation of →I4. In this example, this application yields 

rhs!fst!newc1 × rhs!snd!newc1 → rhs!snd!newc1 × rhs!fst!newc1 
which we call t. 

The body is typechecked as before, using →I4. It goes like this: 

(C1) ρ ⊢ LET newc1#t1: type × t2: type IN                          2, :E 
           λ x1: t1 × x2: t2→t2×t1 IN [x2, x1] :: 
         rhs!fst!newc1 × rhs!snd!newc1 → rhs!snd!newc1 × rhs!fst!newc1 
(C2) ρ1 ⊢ λ x1: t1 × x2: t2→t2×t1 IN [x2, x1] ::                   equality, 3, →I 
         rhs!fst!newc1 × rhs!snd!newc1 → rhs!snd!newc1 × rhs!fst!newc1 
     where ρ1 = ρ[t1: type~rhs!fst!newc1, t2: type~rhs!snd!newc1] 
(C3) ρ1 ⊢ LET newc2#x1: rhs!fst!newc1 × x2: rhs!snd!newc1 IN [x2, x1] ::     4, :E 
         rhs!snd!newc1 × rhs!fst!newc1 
(C4) ρ2 ⊢ [x2, x1] :: rhs!snd!newc1 × rhs!fst!newc1                 5, ×I 
     where ρ2 = ρ1[x1: rhs!fst!newc1~rhs!fst!newc2, 
                   x2: rhs!snd!newc1~rhs!snd!newc2] 
(C5) ρ2 ⊢ x2 :: rhs!snd!newc1, ρ2 ⊢ x1 :: rhs!fst!newc1             6, NI 
(C6) ρ2(x2) ≈ rhs!snd!newc1~e02, ρ2(x1) ≈ rhs!fst!newc1~e01         inspection 

Observe that we carry symbolic forms (e.g. rhs!snd!newc1) of the values of the arguments for 
functions whose bodies are being typechecked. In simple examples such as (A) and (B), these 
values are never needed, but in a polymorphic function like swap they appear as the types of 
inner functions. Validity of the proof rests on the fact that two identical symbolic values always 
denote the same value. This in turn is maintained by the applicative nature of our system and 
the fact that we generate a different newc constant for each λ-expression. 
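
The reliance on identical symbolic values can be illustrated with a short Python sketch in which symbolic values are nested tuples, so that equality is purely structural. The encoding and the helper names `prim`, `cross` and `arrow` are assumptions made for this illustration only.

```python
# Symbolic values are nested tuples; two of them are equal exactly when
# they are built from the same constants in the same way.
newc1 = ("newc", 1)


def prim(name, arg):
    """Build the unevaluated application name!arg."""
    return (name, arg)


def cross(a, b):
    return ("cross", a, b)


def arrow(a, b):
    return ("arrow", a, b)


t1 = prim("rhs", prim("fst", newc1))    # symbolic value of t1 while checking swap
t2 = prim("rhs", prim("snd", newc1))    # symbolic value of t2

expected = arrow(cross(t1, t2), cross(t2, t1))   # intended type of the inner function
body_type = cross(t2, t1)                        # type of [x2, x1] given x1: t1, x2: t2
inferred = arrow(cross(t1, t2), body_type)
types_agree = inferred == expected
```

Had the two occurrences been built from different constants (say newc1 and a fresh newc2), the comparison would fail, which is why the same constant must be used on both sides.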

A function with a dependent type d▷f1 is applied very much like an ordinary function. 

→E  (2) F :: d▷f1, f1#d→type(d#type~E0) ⇒ t, 
    (3) { f ≈ primitive(w), { w!e0 ⇝ e else w!e0 = e } 
    (4)   or f ≈ closure(ρ0, d, E), d#type~E0 ⇒ b, ρ0 ⊢ LET b#d IN E ⇒ e 
    (5)   else in case f is a symbolic value f%e0 = e } 
    (0) F E0 :: t ⇒ e 

The only difference is that →E2 is used for the type computation instead of →E1. This line 
computes the result type of the application by applying f1 to the argument binding d#type~E0. It 
is exactly parallel to →I2, which computes the (symbolic) result type of applying the function 
to the unknown argument binding newc#d. We apply f1 to d#type~E0 rather than to E0 because 
typeIa, which constructs d▷f1, expects a binding as the argument of f1. The reason for this is 
that in →E2 we don't have an expected type for E0, but we do have a declaration d to which it 
can be bound. It is the evaluation of the binding d#type~E0 that checks the type of the argument; 
there is no need for the explicit check E0 :: t0 of →E1. 

5.2.4 Bindings and declarations 

The main rules for bindings show how to typecheck and evaluate a binding made from a 
declaration and an expression (:I) and how to use a binding in a LET to modify the 
environment in which a subexpression is evaluated (:E). The tricky case of recursive bindings 
(:Ia and NI2) is discussed in the next section. Rules :Ib and :Ic define the ; and :~ 
abbreviations; both are very simple. 

The rule for D~E has four cases, depending on the form of the declaration value. 

:I  (1) { D ⇒ d, d ≈ void, E :: void, nil = b 
    (2)   or D ⇒ d, d ≈ N: t, E :: t, (N~e) = b 
    (3)   or D ⇒ d, d ≈ d1×d2, [d1#type~fst E, d2#type~snd E] :: d ⇒ b 
    (4)   or D ⇒ d0, d0 ≈ d1⋆f, f#d1→type(d1#type~fst E) ⇒ d2, 
             d1×d2 = d, d#type~E :: d ⇒ b } 
    (0) D~E :: d ⇒ b 

If the declaration is void, E must have type void also, and the result is nil. If it is N: t, E must 
have type t, and the result is the binding value N~e. These are the base cases. If the declaration 
is d1×d2, E must have a × type, and the result is the value of [d1~fst E, d2~snd E]. Thus 

i: int × x: real ~ [3, 3.14] 

evaluates just like 

[i: int~fst [3, 3.14], x: real~snd [3, 3.14]] 

namely to [i~3, x~3.14]. All three of these cases yield d as the type of the binding. 

The rule for a dependent declaration is more complicated. It is based on the idea that in the 
context of a binding, d0 = d1⋆f can be converted to d1×d2 by applying f to fst E to obtain d2. 
The binding then has the type and value of d1×d2~E. Thus 

t: type ×× x: t ~ [int, 3] 

has type t: type × x: int and evaluates to [t~int, x~3]. In this case the type of the binding is 
not d0, but the simpler cross type d = d1×d2. 

The rule for LET B IN E has exactly the same cases. 

:E  (1) { B :: void, E :: t ⇒ e 
    (2)   or B :: (N: t0), rhs B ⇒ e0, ρ[N: t0~e0] ⊢ E :: t ⇒ e 
    (3)   or B :: d1×d2, snd B ⇒ b2, LET fst B IN LET b2#d2 IN E :: t ⇒ e 
    (4)   or B :: d1⋆f, f#d1→type(fst B) ⇒ d2, LET b#d1×d2 IN E :: t ⇒ e } 
    (0) LET B IN E :: t ⇒ e 

If B has type void, the result is E in the current environment. If B has type N: t0, the result is E 
in an environment modified so that N has type t0 and value e0 obtained by evaluating rhs B. Thus 

LET i: int~3 IN i+4 

has the same type and value that i+4 has in an environment where i: int~3, namely type int 
and value 7. 

If B has a cross type, the result is the same as that of a nested LET which first adds fst B to the 
environment and then adds snd B. The rule evaluates snd B separately; if it said 

LET fst B IN LET snd B IN E 

the value of snd B would be affected by the bindings in fst B. 
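
The difference between evaluating snd B separately and evaluating it inside the scope of fst B can be made concrete with a Python sketch. The helpers `let_parallel` and `let_nested` are hypothetical, and a binding is modelled as a list of (name, right-hand side) pairs whose right-hand sides are functions of an environment.

```python
def let_parallel(env, binding, body):
    """:E3 in spirit: every right-hand side is evaluated in the outer environment."""
    values = {name: rhs(env) for name, rhs in binding}
    return body({**env, **values})


def let_nested(env, binding, body):
    """The rejected alternative: later right-hand sides see the earlier names."""
    for name, rhs in binding:
        env = {**env, name: rhs(env)}
    return body(env)


outer = {"i": 10}
# Corresponds roughly to the binding [i: int~1, j: int~i+1].
binding = [("i", lambda rho: 1), ("j", lambda rho: rho["i"] + 1)]
read_both = lambda rho: (rho["i"], rho["j"])

parallel_result = let_parallel(outer, binding, read_both)
nested_result = let_nested(outer, binding, read_both)
```

In the first version j is computed from the outer i; in the second it would pick up the newly bound i, which is the effect the rule is written to avoid.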

Finally, if B has a dependent type, that type is reduced to an ordinary cross type d1×d2, and 
the result is the same as LET B' IN E, where B' = b#d1×d2 has the same value as B, but an ordinary 
cross type. The last case will never arise in a LET with an explicit binding expression for B, 
since :I will always compute a cross type for such a B. However, when type-checking a 
function such as 

λ t: type ×× x: t→int IN E 
→I4 requires a proof of 
LET newc#d1⋆f IN E :: int 

where d1⋆f is the value of t: type ×× x: t. :E4 reduces this to 

LET newc#t: type × x: rhs!fst!newc IN E :: int 



5.2.5 Recursion 

Recursion is handled by a fixed point constructor in the value space, fix(f, d). If f is a 
function with type d→d, then fix(f, d) has type d and is the fixed point of f, i.e., 

fix(f, d) = f(fix(f, d)). 

The novelty is in the treatment of mutual recursion: d may declare any number of names, and 
correspondingly fix(f, d) binds all these names. A fix(f, d) value is the result of evaluating a 
REC D~E binding. For example, 

REC g: (int→int) × h: (int→int) ~ [ 
    λ x: int→int IN IF x=0 THEN 1 ELSE x*h(x/2), 
    λ y: int→int IN IF y<2 THEN 0 ELSE g(y-2) ] 

has type 

g: (int→int) × h: (int→int). 

Its value is a binding for g and h in which their values are the closures we would expect, with 
an environment ρgh that contains suitable recursive bindings for g and h. We shall soon see 
how this value is obtained, but for the moment let us just look at it: 

[ g~closure(ρgh, x: int, IF x=0 THEN 1 ELSE x*h(x/2)), 
  h~closure(ρgh, y: int, IF y<2 THEN 0 ELSE g(y-2)) ] 

where 

ρgh = ρ[g: int→int~rhs!fst!fix(f, d), h: int→int~rhs!snd!fix(f, d)], 
where 

d = g: (int→int) × h: (int→int) 
and 

f = closure(ρ, F': d, LET F' IN d~[ 
        λ x: int→int IN IF x=0 THEN 1 ELSE x*h(x/2), 
        λ y: int→int IN IF y<2 THEN 0 ELSE g(y-2) ]) 

It is the fix values inside ρgh that capture the infinite value of this recursive binding in our 
operational semantics. Of course, if g is looked up in ρgh (as it will be, for example, when we 
compute h(3)), we don't want to obtain rhs!fst!fix(f, d) as its value; rather, we want 
closure(ρgh, x: int, ...). To get this we unroll the fix value, that is, we replace fix(f, d) by 
f(fix(f, d)), which evaluates to a closure. This unrolling is done by the ⇝> rule, which also 
deals with the possibility that there may be an operator such as rhs outside. 

⇝>  (1) { e0 ≈ w!e1, e1 ⇝> e2, w!e2 ⇝ e 
    (2)   or e0 ≈ fix(f, d), f#d→d(e0#d) ⇒ e } 
    (0) e0 ⇝> e 

This rule unrolls rhs!fst!fix(f, d) by first computing 

fix(f, d) ⇝> [g~closure(ρgh, x: int, ...), h~closure(ρgh, y: int, ...)] 

using ⇝>2 and →E, and then simplifying rhs!fst!fix(f, d) to closure(ρgh, x: int, ...) using 
⇝>1, ⇝a and ⇝b. Thus, each time g or h is looked up in ρgh, the NI and ⇝> rules unroll 
the fix once, which is just enough to keep the computation going. 
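
The idea of unrolling a fix exactly once per lookup can be sketched in Python for a single recursive name. The class `Fix` and the helpers `unroll`, `lookup` and `functional` are hypothetical stand-ins; the recursive function is the P of example (D) below, and the declaration component d of fix(f, d) is omitted.

```python
class Fix:
    """Stand-in for fix(f, d): an unevaluated fixed point of a functional."""

    def __init__(self, functional):
        self.functional = functional


def unroll(value):
    # One unrolling step: fix(f) is replaced by f(fix(f)).
    if isinstance(value, Fix):
        return value.functional(value)
    return value


def lookup(env, name):
    # A fix found in the environment is unrolled once on each lookup.
    return unroll(env[name])


def functional(self_fix):
    env = {"P": self_fix}          # environment containing the recursive binding

    def p(n):
        # IF n<2 THEN n ELSE P(n-2), with P looked up in env on demand.
        return n if n < 2 else lookup(env, "P")(n - 2)

    return p


P = unroll(Fix(functional))
value_at_3 = P(3)
```

No infinite structure is ever built: the environment only holds the `Fix` object, and a fresh closure is produced each time the name is looked up.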



For the persistent reader, we now present in detail the evaluation of a simple recursive binding 
with one identifier, and an application of the resulting function. Since some of the expressions 
and values are rather long, we introduce names for them as we go. First the recursive binding: 

(D) REC P: int→int ~ λ n: int→int IN 
        IF n<2 THEN n ELSE P(n-2) 

We can write this more compactly as 
REC DP~L 

where 

DP = P: int→int, 

L = λ n: int→int IN EXP, 

EXP = (IF n<2 THEN n ELSE P(n-2)) 

The table below is a proof that the value of (D) is 

P: int→int~closure(ρFP, n: int, EXP) 
It has been abbreviated by omitting the # types on values which are used as expressions. The 
evaluation goes like this. First we construct the λ-expression for the functional whose fixed 
point f we need (D3) and evaluate it to obtain a closure (D4). Then, according to :Ia2, we 
embed f in fix(f, dp) and unroll it. This requires applying f to the fix (D5), which gives rise 
to a double LET (D6), one from the application and the other from the definition of the 
functional. After both LETs have their effect on the environment, we have ρFP, which contains the 
necessary fix value for P (D7-10). Now evaluating the λ to obtain a closure value for P 
that contains ρFP is easy (D11-12). 



(D1)   ρ ⊢ REC DP~L :: dp ⇒ bp                           :Ia, 1a, 3, 5 
(D1a)  ρ ⊢ DP ⇒ dp                                       typeIc, 2 
(D2)   P: int→int = dp                                   definition 
(D3)   ρ ⊢ (λ F': DP→DP IN LET F' IN dp~L) ⇒ f           →I, 4 
(D4)   closure(ρ, F': dp, LET F' IN dp~L) = f            definition 
(D5)   ρ ⊢ f(fix(f, dp)) ⇒ bp                            →E, 6 
(D6)   ρ ⊢ F': dp~fix(f, dp) ⇒ bf,                       #, :I, 7, :E, 8 
       ρ ⊢ LET bf IN LET F' IN dp~L ⇒ bp 
(D7)   F': dp~fix(f, dp) = bf                            definition 
(D8)   ρ_f ⊢ LET F' IN dp~L ⇒ bp                         :E, 9, 10 
         where ρ_f = ρ[F': dp~fix(f, dp)] 
(D9)   ρ_f ⊢ rhs F' ⇒ rhs!fix(f, dp)                     →E, NI 
(D10)  ρ_fP ⊢ dp~L ⇒ bp                                  :I, 11, 12 
         where ρ_fP = ρ_f[P: int→int~rhs!fix(f, dp)] 
(D11)  ρ_fP ⊢ L ⇒ closure(ρ_fP, n: int, EXP) 
(D12)  P~closure(ρ_fP, n: int, EXP) = bp                 definition 



Note that this evaluation does not depend on having λ-expressions for the values of the 
recursively bound names. It will work fine for ordinary expressions, such as 

REC i: int × j: int ~ [j+1, 0], 

which binds i~1 and j~0. However, it may not terminate. For instance, consider 

REC i: int × j: int ~ [j+1, i] 



Now look at an application of P: 

(E) LET (REC P: int→int ~ λ n: int→int IN 
             IF n<2 THEN n ELSE P(n-2)) 
    IN P(3) 

This has type int and value 1, as we see in the proof which follows. First we get organized to 
do the application with the proper recursive value for P (E1-2). The application becomes a LET 
after P and 3 are evaluated (E3-5). This results in an environment ρ_n3 in which n~3, so we 
need to evaluate P(n-2) (E6-7). Looking up P we find a value which can be unrolled (E8-9) to 
obtain the recursive value closure(ρ_fP, n: int, EXP) again (E10-11). Since n-2 ⇒ 1 (E12), we 
get the answer without any more recursion (E13-15). 



(E1)   ρ ⊢ (LET REC DP~L IN P(3)) :: int ⇒ 1                 :E, D, 2 
(E2)   ρ_P ⊢ P(3) :: int ⇒ 1                                 3, 4, 5 
         where ρ_P = ρ[P: int→int~closure(ρ_fP, n: int, EXP)] 
(E3)   ρ_P ⊢ P :: int→int ⇒ closure(ρ_fP, n: int, EXP)       NI 
(E4)   ρ_P ⊢ n: int~3 ⇒ n~3                                  :I, intI 
(E5)   ρ_fP ⊢ LET n~3 IN EXP :: int ⇒ 1                      :E, 6 
(E6)   ρ_n3 ⊢ IF n<2 THEN n ELSE P(n-2) :: int ⇒ 1           boolE, 7 
         where ρ_n3 = ρ_fP[n: int~3] 
(E7)   ρ_n3 ⊢ P(n-2) :: int ⇒ 1                              8, 12, 13 
(E8)   ρ_n3 ⊢ P :: int→int ⇒ closure(ρ_fP, n: int, EXP)      NI, 9 
(E9)   rhs!fix(f, dp) ↝ closure(ρ_fP, n: int, EXP)           ↝, 11 
(E11)  rhs!bp ⇝ closure(ρ_fP, n: int, EXP) 
(E12)  ρ_n3 ⊢ n-2 :: int ⇒ 1                                 NI 
(E13)  ρ_fP ⊢ LET n~1 IN EXP :: int ⇒ 1                      :E, 14 
(E14)  ρ_n1 ⊢ IF n<2 THEN n ELSE P(n-2) :: int ⇒ 1           boolE, 15 
         where ρ_n1 = ρ_fP[n: int~1] 
(E15)  ρ_n1 ⊢ n :: int ⇒ 1                                   NI 



It should be clear to anyone who has followed us this far that we have given a standard opera- 
tional treatment of recursion. There is some technical interest in the way the fix is unrolled, 
and in the handling of mutual recursion. 
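
The evaluation of P(3) traced above can be replayed with a very small interpreter. The following Python sketch is an assumption-laden illustration rather than the Pebble semantics: types are ignored, expressions are encoded as tuples, and the names `Closure`, `RhsFix`, `evaluate` and `f` are invented for this example.

```python
from dataclasses import dataclass

@dataclass
class Closure:
    env: dict
    param: str
    body: tuple

@dataclass
class RhsFix:
    """Stands for rhs!fix(f, dp): the recursive value of P before unrolling."""
    functional: object  # a Python callable modelling f

def evaluate(expr, env):
    tag = expr[0]
    if tag == "const":
        return expr[1]
    if tag == "name":  # name lookup, followed by one unrolling if needed
        value = env[expr[1]]
        return value.functional(value) if isinstance(value, RhsFix) else value
    if tag == "sub":
        return evaluate(expr[1], env) - evaluate(expr[2], env)
    if tag == "lt":
        return evaluate(expr[1], env) < evaluate(expr[2], env)
    if tag == "if":
        branch = expr[2] if evaluate(expr[1], env) else expr[3]
        return evaluate(branch, env)
    if tag == "apply":  # bind the argument, then evaluate the closure body
        closure = evaluate(expr[1], env)
        arg = evaluate(expr[2], env)
        return evaluate(closure.body, {**closure.env, closure.param: arg})
    raise ValueError(tag)

# EXP = (IF n<2 THEN n ELSE P(n-2))
EXP = ("if", ("lt", ("name", "n"), ("const", 2)),
       ("name", "n"),
       ("apply", ("name", "P"), ("sub", ("name", "n"), ("const", 2))))

def f(fix_value):
    """The functional: evaluate L in an environment where P is still the fix."""
    return Closure({"P": fix_value}, "n", EXP)

P = f(RhsFix(f))  # value of REC DP~L after one unrolling
result = evaluate(("apply", ("name", "P"), ("const", 3)), {"P": P})
```

Here the recursive call looks up P, finds the un-unrolled value, and unrolls it once to get the closure again, which mirrors steps (E8) to (E11).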

5.2.6 Inferring types 

The inference rules give a way of computing a type for any expression. In some cases, however, 
an expression may have additional types. In particular, this happens with types of the form 
d*f and typeOf!(d*f), because pairs with these dependent types also have ordinary cross 
types, which are the ones computed by the inference rules. To express this fact, there is an 
additional inference rule :: which tells how to infer types that are not computed by the rest of the 
rules. 

::  (1) t ~ d*f , f#d→type(fst E) ⇒ t2 , E :: d×t2 
    (2) or t ~ typeOf!d , d#type~E :: d 
Introduction 



(a0) true :: bool 



(b0) false :: bool 



(c0) 0 :: int 



0) 



(a0) nil :: void 



(0) [E1, E2] :: t1×t2 



(i){ =>/^j , t^j:::rt/->/ , typcOf d^^type =>/q , t^-^t^ 
<2) or =>/i , "^dV /, f#d-*type(newc#d) =>/} , 

(3) 

(4)LHT newcj^diNE:: t 

where newc is a new constant 



<0) 



(Ar^iN£)::t^ 



(1) D ⇒ d , d ~ void , E :: void , 

(2) or D ⇒ d , d ~ N: t , E :: t , 

(3) or D ⇒ d , d ~ d1×d2 , [d1#type~fst E, d2#type~snd E] :: d 

(4) or D ⇒ d0 , d0 ~ d1*f , f#d1→type(d1#type~fst E) ⇒ d2 , 

(5) d1×d2 = d , d#type~E :: d 



(0) 

(al)D=>t/ 
(a2) 

(aO) 



REC D~E :: d 



(b1) [B1, LET B1 IN B2] :: t 



(cl) 



E:: / 



Elimination 



(1) E :: bool , E1 :: t , E2 :: t , 

(2) 



(0) 



IF E THEN E1 ELSE E2 :: t 



(a0) fst :: (t×t1)→t        (b0) snd :: (t1×t)→t 

(1) { F::/q->/ ,E,::t, 

(2) or F d > , f^#d-type(dA^type'-EQ) => / } , 
(3) 

(4) 
(5) 



(0) 



(1) B :: void , E :: t 

(2) or B :: (N: t0) , rhs B ⇒ e0 , ρ[N: t0~e0] ⊢ E :: t 

(3) or B :: d1×d2 , snd B ⇒ b2 , LET fst B IN LET b2#d2 IN E :: t 

(4) or B :: d1*f , f#d1→type(fst B) ⇒ d2 , LET b#d1×d2 IN E :: t 

(5) 



<0) 



let E:: t=>e 



(b0) rhs :: (N: t)→t 



these two are for convenience only 



(a1) λ B': D→type IN LET B' IN T 

(a0) D→→T :: type 

(b1) λ B': D→type IN LET B' IN T 

(b0) D××T :: type 

(c1) T :: type 

(c0) N: T :: type 



(1) 

(2) 
(3) 

(4) 

(5) D :: type 

(0) 



typeOf D :: type 



(1) ρ(N) ~ t~e0 

(2) 



(0) 



(1) [N: ρ(N)] ⊢ E :: t 

(0) IMPORT N IN E :: t 



Table 6a: Inference rules for type checking only 
Elimination 



(a0) true ⇒ true        (b0) false ⇒ false        (c0) 0 ⇒ 0 



(1) 



(2) 



(0) 



{ E ⇒ true , E1 ⇒ e or false , E2 ⇒ e } 

IF E THEN E1 ELSE E2 ⇒ e 



(aO) 



nil 



(1) 



F / F " / 



(aO) 



(bO) 



(1) 



{ r 



(2) orTj =>/j , f 

(3) 

(4) LET newc#d IN E :: t 

where newc is a new constant 



(0) (λ T1 IN E) ⇒ closure(ρ, d, E) 



(1) 

(2) 



(3) { f ~ primitive(w) , { w!e0 ⇝ e else w!e0 = e } 

(4) or f ~ closure(ρ0, d, E) , d#type~E0 ⇒ b , ρ0 ⊢ LET b#d IN E ⇒ e 

(5) else in case f is a symbolic value f%e0 = e } 



(0) 



(1) D ⇒ d , d ~ void , nil = b 

(2) or D ⇒ d , d ~ N: t , (N~e) = b 

(3) or D ⇒ d , d ~ d1×d2 , [d1#type~fst E, d2#type~snd E] :: d ⇒ b 

(4) or D ⇒ d0 , d0 ~ d1*f , f#d1→type(d1#type~fst E) ⇒ d2 , 

(5) d1×d2 = d , d#type~E :: d ⇒ b 

(0) D~E ⇒ b 



(1) B :: void , E ⇒ e 

(2) or B :: (N: t0) , rhs B ⇒ e0 , ρ[N: t0~e0] ⊢ E ⇒ e 

(3) or B :: d1×d2 , snd B ⇒ b2 , LET fst B IN LET b2#d2 IN E ⇒ e 

(4) or B :: d1*f , f#d1→type(fst B) ⇒ d2 , LET b#d1×d2 IN E ⇒ e 

(0) LET B IN E ⇒ e 



(a1) D ⇒ d , (λ F': D→D IN LET F' IN d#type~E) ⇒ f 

(a2) f#d→d(fix(f, d)#d) ⇒ b 

(a0) REC D~E ⇒ b 



(b1) [B1, LET B1 IN B2] ⇒ b 

(b0) B1; B2 ⇒ b 



(c1) 



{dS) N \ E .'. {\^\ t)=> N '-e //r^^e f>vo are for convenience only 



>e 
>e 

>e 

>€ 



(a1) λ B': D→type IN LET B' IN T ⇒ f 
(a0) D→→T ⇒ d▷f 
(b1) λ B': D→type IN LET B' IN T ⇒ f 
(b0) D××T ⇒ d*f 

(cl) 



(CO) TV: T 



(1) { d ~ void , void = t 

(2) or d ~ N: t 

(3) or d ~ d1×d2 , typeOf d1#type × typeOf d2#type ⇒ t 

(4) or d ~ d1*f , typeOf!d = t } , 

(5) 

(0) typeOf D ⇒ t 



(1) ρ(N) ~ t~e0 

(2) { e0 ↝ e else e0 = e } 

(0) N ⇒ e 



(1) [N: ρ(N)] ⊢ E 



(0) IMPORT N IN E 



Table 6b: Inference rules for evaluation only 

type VR = record case tag of 
    boolean:        (v: (true, false)), 
    typeConst:      (v: (bool, int, void, type)), 
    primitive:      (w: Primitive), 
    nil:            (), 
    pair:           (e1, e2: V),                              {[e1, e2]} 
    closure:        (rho: Binding, domain: Decl, body: Ex), 
    binding:        (n: Name, e: V),                          {n~e} 
    bang:           (w: V {Primitive}, e: V),                 {w!e} 
    cross:          (t1, t2: Type),                           {t1×t2} 
    dcross:         (d: Decl, f: V {closure}),                {d*f} 
    arrow:          (domain, range: Type),                    {t→t0} 
    darrow:         (domain: Decl, frange: V {closure}),      {d▷f} 
    sdecl:          (n: Name, t: Type),                       {n: t} 
    symbolicApply:  (f, e: V),                                {f%e} 
end 

type V = pointer to VR; 

type Binding = VR    { either V {binding} 
                       or V {pair[Binding, Binding]} } 

type Decl = VR       { either V {sdecl} 
                       or V {nil} 
                       or V {cross[Decl, Decl]} 
                       or V {dcross[Decl, V.closure]} } 

type Type = VR       { either V {typeConst} 
                       or V {cross} 
                       or V {dcross} 
                       or V {arrow} 
                       or V {darrow} 
                       or V {decl} } 
                     {or any of these could be V {bang} or V {symbolicApply} } 

type ExR = record case tag of 
    constant:  (c: (true, false, ...)), 
    if:        (E: Ex, E1, E2: Ex),      {IF E THEN E1 ELSE E2} 
    pair:      (E1, E2: Ex),             {[E1, E2]} 
    lambda:    (T1, E: Ex),              {λ T1 IN E} 
    apply:     (F, E: Ex),               {F E} 
    binding:   (D, E: Ex),               {D~E} 
    rec:       (D, E: Ex),               {REC D~E} 
    semi:      (B1, B2: Ex),             {B1; B2} 
    let:       (B, E: Ex),               {LET B IN E} 
    name:      (N: Name), 
    import:    (N: Name, E: Ex),         {IMPORT N IN E} 
    darrow:    (D, T0: Ex),              {D→→T0} 
    sdecl:     (N: Name, T: Ex)          {N: T} 
    dcross:    (D1, D2: Ex),             {D1××D2} 
    ev:        (v: V, T: Type)           {v#t} 
end 

type Ex = pointer to ExR; 



Table 7a: Pascal declarations for the Pebble semantics 



procedure I(EE: Ex, rho: Binding, typeOnly: boolean, var t: Type, var v: V); 
var 
    t0: Type; e0: V; 
    tF: Type; f: V; 
    tE: Type; e: V; 
    B: Ex; 
    tx: Type; 
    b: Binding; 
    Let: Ex; 
begin 
    case EE↑.tag of 

    apply: begin 
        I(EE↑.F, rho, tF, f); 
        I(EE↑.E, rho, tE, e); 
        case tF↑.tag of 
        arrow:                                                    { F :: t→t0 , E :: t } 
            if tF↑.domain = tE or HasType(E, tF↑.domain) 
            then t0 := tF↑.range else Fail; 
        darrow: begin                                             { F :: d▷f0 , f0#d→type } 
            B := Bind(EV(tF↑.d, type), EE↑.E);                    { (d#type~E) ⇒ t0 } 
            I(Apply(F, B), rho, tx, v) end; 
        else Fail; 
        end; 

        if not typeOnly then case f↑.tag of 
        primitive: begin                                          { f ~ primitive(w) , } 
            new(e0, bang); e0↑.w := f↑.w; e0↑.e := e;             { w!e ⇝ e0 else w!e = e0 } 
            e0 := Simplify(e0) end; 
        closure: begin                                            { f ~ closure(ρ0, d, E0) } 
            I(Bind(EV(f↑.domain, type), E), rho, tx, b);          { d#type~E ⇒ b } 
            new(Let, let); Let↑.B := EV(b, f↑.domain); Let↑.E := f↑.body; 
            I(Let, f↑.rho, tx, e0) end;                           { ρ0 ⊢ LET b#d IN E0 ⇒ e0 } 
        else begin new(e0, symbolicApply); e0↑.f := f; e0↑.e := e end; 
        end; 

        else e0 := notDone; 
    end; 

    end 

procedure Bind(D, E: Ex): Ex; begin new(Bind, binding); Bind↑.D := D; Bind↑.E := E end; 
procedure Apply(F, E: Ex): Ex; begin new(Apply, apply); Apply↑.F := F; Apply↑.E := E end; 
procedure EV(v: V, t: Type): Ex; begin new(EV, ev); EV↑.v := v; EV↑.t := t end; 



Table 7b: Pascal code for →E 

::1 turns d*f into d×t by applying f to fst E to compute t; then it checks that E has type d×t. 
This is a reflection of the fact already discussed, that a pair may have many dependent types, 
as well as its "basic" cross type. 

::2 deals with the case in which typeOf cannot be evaluated immediately. typeE tells us that 

typeOf(x: int) = int 
typeOf(x: int × y: real) = int×real, 

but 

typeOf(t: type ×× x: t) = typeOf!(t: type*closure(ρ, t: type, x: t)) 

because there is no way to compute typeOf(x: t) without a value for t. One might think that 
there would be some way to obtain 

t: type*closure(ρ, t: type, t) 

as the value, but consider 

typeOf(b: bool ×× IF b THEN x: int ELSE y: real) 

Once we have a pair E in hand, however, it is easy to check whether it has type typeOf!d*f 
simply by seeing whether d*f~E type-checks. This expression is checked by :I4-5, which turns 
d*f into d×t, much as ::1 does. 
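
The check described here, apply f to the first component and then test the pair against an ordinary cross type, can be sketched in Python. The function `has_dependent_type` and the use of Python classes as stand-ins for Pebble types are assumptions made for illustration only.

```python
def has_dependent_type(pair, first_type, f):
    """Check a pair against d*f: compute t from the first component, then check d×t."""
    first, second = pair
    if not isinstance(first, first_type):
        return False
    second_type = f(first)  # apply f to fst E to compute t
    return isinstance(second, second_type)

def depends_on_b(b):
    """Models b: bool ×× IF b THEN x: int ELSE y: real."""
    return int if b else float
```

The type of the second component cannot be written down before the first component is known, which is why the check has to wait until a pair is in hand.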

5.3 Type checking vs evaluation 

There is a subtle interaction between type-checking and evaluation in Pebble, which is 
illustrated in Table 6. It is possible to write inference rules which only do type-checking; they must 
be able to call on the evaluator to evaluate type expressions. It is also possible to write 
inference rules which only do evaluation, on the assumption that type-checking has already 
been done. Table 6 repeats the entire set of inference rules twice (except for the auxiliary 
rules), once with only the parts needed for type-checking, and once with only the parts needed 
for evaluation. It is important to note that the type-checking rules contain calls on the 
evaluator, in the form of occurrences of the ⇒ symbol. 

Note that most of the rules for ~ and LET are needed for both purposes. This is because these 
rules set up environments which bind names, and there is no way to tell whether a given name 
will be needed to evaluate a type expression. In fact, the rules as written in Table 6 are 
pessimistic; during type-checking it is possible to defer evaluation of the right hand side of a 
binding until the value of that name is actually needed. As examples (B) and (C) above 
suggest, this usually happens only when the name denotes a type. 
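
One way to picture this deferral is with a memoized thunk per binding. The sketch below is a hypothetical illustration: the class `Deferred`, the function `check_declaration` and the sample environment are not part of Pebble, and Python classes again stand in for types.

```python
class Deferred:
    """A right hand side whose evaluation is postponed until first use."""

    def __init__(self, thunk):
        self._thunk = thunk
        self._forced = False
        self._value = None
        self.evaluations = 0  # counts how often the thunk actually ran

    def force(self):
        if not self._forced:
            self._value = self._thunk()
            self._forced = True
            self.evaluations += 1
        return self._value

def check_declaration(type_name, env):
    """Type-check a declaration `x: type_name`; only the type's binding is forced."""
    value = env[type_name].force()
    if not isinstance(value, type):
        raise TypeError(f"{type_name} does not denote a type")
    return value

env = {
    "t": Deferred(lambda: int),                    # a name denoting a type
    "table": Deferred(lambda: sum(range(10**6))),  # an ordinary value, not needed here
}
checked = check_declaration("t", env)
```

After the check, only the binding for the type name has been evaluated; the ordinary value is left untouched until evaluation proper asks for it.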

5.4 Deterministic evaluation 

As we mentioned in §5.1.2, it is possible to construct a deterministic evaluator from the 
inference rules. Table 7 gives Pascal declarations for such an evaluator, together with a 
fragment of the code, namely the part which corresponds to →E. It is interesting to note the close 
correspondence between the inference rule and the Pascal code, as well as the fact that the 
code is only about twice as large. 
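
For readers more at home in a modern language, the evaluation half of the application case can be paraphrased in Python. This is a loose sketch and not a translation of Table 7: type checking is omitted, closure bodies are modelled as Python callables over an environment, and every class and function name below is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Primitive:
    w: Callable[[Any], Any]

@dataclass
class Closure:
    rho: dict
    param: str
    body: Callable[[dict], Any]  # stand-in for an expression evaluated in an environment

@dataclass
class Symbol:
    name: str

@dataclass
class Bang:  # w!e, a primitive application that could not be simplified
    w: Callable[[Any], Any]
    e: Any

@dataclass
class SymbolicApply:  # f%e
    f: Any
    e: Any

def is_symbolic(value):
    return isinstance(value, (Symbol, Bang, SymbolicApply))

def apply_value(f, e):
    """Dispatch on the kind of function value, as the three cases of the rule do."""
    if isinstance(f, Primitive):
        # simplify w!e when possible, otherwise keep it as a value
        return Bang(f.w, e) if is_symbolic(e) else f.w(e)
    if isinstance(f, Closure):
        # bind the argument in the closure's environment, then evaluate the body
        return f.body({**f.rho, f.param: e})
    return SymbolicApply(f, e)  # f is a symbolic value
```

The three branches correspond to a primitive, a closure, and a symbolic function value; the last case simply records the application for later.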



6. Conclusion 



We have presented both an informal and a formal treatment of the Pebble language, which 
adds to the typed lambda calculus a systematic treatment of sets of labeled values, and an 
explicit form of polymorphism. Pebble can give a simple account of many constructs for 
programming in the large, and we have demonstrated this with a number of examples. The 
language derives its power from its ability to manipulate large, structured objects without 
delving into their contents, and from the uniform use of λ-abstraction for all its entities. 

A number of areas are open for further work: 

Labelled unions or sum types, discussed briefly in § 3.5. 

Abbreviations which allow explicit type parameters to be omitted from applications of 
polymorphic functions. 

A sub-type or type inheritance relation, perhaps along the lines suggested by Cardelli. 

Assignment, discussed briefly in § 3.6. 

Exception-handling, probably as an abbreviation for returning a union result and testing 
for some of the cases. 

Concurrency. We do not have any ideas about how this is related to the rest of Pebble. 

A more mathematical semantics for the language. 

Proof of the soundness of the type-checking, and an exploration of its limitations. 